One of the “secrets” to providing and maintaining great service integrations is identifying clear technical owners (who may be different from the business owners) for all parties. Whether the services are internal or external third-party services, having a means to contact the counterpart in the relationship is absolutely necessary.

The owner may or may not be a single person. It could be a team mailing list, but having a known point of contact is essential to great services.

For instance, if a service provider expects to change its API (with breaking changes or just additions), it should be able to inform the service consumer. Maybe there’s expected maintenance. Maybe there is now a (better) replacement service. Or maybe there are different technical requirements (such as security changes). Maybe the service consumer is hammering the provider with an unexpected amount of traffic.

Of course, a service consumer should be able to contact the service provider for support. Maybe there’s an issue attempting to reach the service. Maybe there are problems integrating, or features that would make life easier.

Having contact information for both the service provider and consumer seems obvious, but services often do not invest in making sure that the contacts are the intended audience, that contact details are up to date, and that the medium is appropriate.

Separation of Business vs Technical

Many services fail to distinguish the technical role from the business role. When you sign up for a third-party service account, often you only provide a single email address or phone number. While the person owning the business relationship with the service provider may be technical, there is often someone else who is supposed to own the technical role.

Business owners may not have a clear understanding of technical details. So if the service provider were to contact the service consumer, the business owner may not understand the email and may fail to take the appropriate action. For instance, a service provider may state that integrations are required to use a specific version of an SDK. Unless the business owner is technical enough to understand, the news may be lost and the service consumer may be forced into a last-minute change. Nothing should stop the business owner from also receiving technical information; when adding a technical contact, the intent is not to skip the business owner, but to ensure that information reaches the relevant audience. In a way, it saves the business owner the work of forwarding emails.
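As a rough illustration of keeping the two roles separate, here is a minimal sketch of how a provider might store more than one contact per account and route notices by role. The Role, Topic, and ROUTING names, the topics themselves, and the example addresses are all assumptions made up for this sketch, not part of any real service.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    BUSINESS = "business"
    TECHNICAL = "technical"


class Topic(Enum):
    BILLING = "billing"
    SDK_REQUIREMENT = "sdk_requirement"
    MAINTENANCE = "maintenance"


@dataclass(frozen=True)
class Contact:
    email: str
    role: Role


# Hypothetical routing table: which roles receive which kind of notice.
# Technical notices also go to the business owner so nobody is skipped.
ROUTING = {
    Topic.BILLING: {Role.BUSINESS},
    Topic.SDK_REQUIREMENT: {Role.TECHNICAL, Role.BUSINESS},
    Topic.MAINTENANCE: {Role.TECHNICAL, Role.BUSINESS},
}


def recipients(contacts, topic):
    """Return the sorted, de-duplicated addresses that should receive a notice."""
    roles = ROUTING[topic]
    return sorted({c.email for c in contacts if c.role in roles})


contacts = [
    Contact("owner@example.com", Role.BUSINESS),
    Contact("platform-team@example.com", Role.TECHNICAL),
]

sdk_notice = recipients(contacts, Topic.SDK_REQUIREMENT)
billing_notice = recipients(contacts, Topic.BILLING)
```

In this sketch the technical contact is a team mailing list rather than a person, and the business owner still receives the SDK notice without having to forward it.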

Ensuring up-to-date contact details

Services should build in an automated system to ensure that contact details are up to date for all relevant parties. Technical owners change often, and business owners change too. Unfortunately, reorgs and other personnel changes shuffle responsibilities, and practically no one will (or should) volunteer to take on more responsibilities without a clear benefit.

For small businesses, a contractor may be brought in to develop a new feature using a third-party service, but once the contractor has finished the work, the contractor is no longer the appropriate person to contact. Unfortunately, the service provider is not given the right contact information, so any news from the service provider is subsequently lost.

When the contact information is out of date, it leads to unexpected outages, emergency patches, and other stressful events. All because one party could not contact the other.

Whether it is reminding contacts to update their information every few months, providing tools to transfer ownership, or some other means, service providers and consumers should be required to keep their contact information up to date with their respective counterparts.
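The periodic reminder idea could be sketched roughly as follows. The 90-day interval, the ContactRecord type, and the function names are assumptions chosen for illustration; a real system would also need to send the reminder and handle contacts who never respond.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed policy: ask for re-confirmation every 90 days.
REVERIFY_AFTER = timedelta(days=90)


@dataclass
class ContactRecord:
    email: str
    last_confirmed: date


def needs_reminder(records, today):
    """Return the records whose last confirmation is older than the policy allows."""
    return [r for r in records if today - r.last_confirmed > REVERIFY_AFTER]


def confirm(record, today):
    """Record that the contact confirmed their details are still correct."""
    record.last_confirmed = today


records = [
    ContactRecord("contractor@example.com", date(2024, 1, 10)),
    ContactRecord("platform-team@example.com", date(2024, 5, 20)),
]

# Only the contractor's record is older than 90 days on this date.
stale = needs_reminder(records, today=date(2024, 6, 1))
```

The date is passed in explicitly rather than read from the clock, which keeps the check easy to test and to run as a scheduled job.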

Appropriate medium for contact

In short, email. While there are many chat programs and services, email is nearly universal. Communication between the parties should be infrequent, so email is appropriate in many ways.

Each service can decide what is appropriate to contact the other party with (e.g. new API capabilities, SDK versions, requirement changes, or maybe only important emergency notices).

Blogs, GitHub issues, RSS feed notifications for version releases, and other mediums are nice, but in many situations neither party can count on the other to monitor them. Email may also be ignored, but it is practically the only universal medium where direct contact can be made.

In conclusion, hopefully this post has given some ideas about how service providers and consumers can communicate effectively, and why maintaining that relationship helps ensure a smooth service integration.

Synchronization has been on my mind recently. I have to build a sync engine, primarily on Apple platforms. While the ultimate goal is to make sure that a user can access and modify data on any device, there are many implementation details which have different tradeoffs.

For instance, how do different devices communicate changes to each other? The general assumption is that a network connection will transmit some form of state. Is the state transmitted via HTTP or some other protocol? Is the complete state of the device transmitted or only the changes made on the device? Should data be synchronized in real time or can it be synchronized with a delay?

Beyond how the state is transmitted, is each device a complete view of the entire data set? Does each device even have the capacity for all of the data? If each device is only a partial view of the data, is there a single source of truth? How does a device determine what is relevant data if it only has a partial view, and how does it obtain the relevant data?

Do devices have constant and reliable network connectivity? If they do not, how does data get merged together when a device cannot communicate? While the device is unreachable, it may make changes and other devices could make changes as well. If there are conflicting changes made, how do the conflicts get resolved?

These questions and decisions are only a few of the considerations that will have major effects on how synchronization is implemented.

To clarify which choices to make, there are other desired goals when synchronizing data:

  • The data transmitted to and from a device should only be what is absolutely necessary. Network resources can be fairly scarce, especially on mobile devices. Whether on Wi-Fi, on cellular networks, or on wired Ethernet, using the network should be considered relatively expensive. Furthermore, any data sent or received requires power to process, so it is better to only process what is necessary.

  • Synchronization data should be batchable. With network resources being considered scarce, the number of network requests should be limited. Therefore, any data changes should be able to be batched together as one network request (whether data is transmitted to or from a device).

  • Devices should be able to receive a partial set of the synchronization data and make progress towards the latest state. If a device received only half of the data updates required to transition the device to the latest state, it should still be able to process the updates received and transition the local device state. The synchronization updates could be large enough either in quantity or size that a device may need to process multiple batches.

  • The processing of synchronization data should be streamable. Whether there is one other device or hundreds of other devices, a device could need to synchronize a massive number of updates made since the last time the device synced. In a mobile device world, the devices have varying degrees of processing power with limited amounts of memory, so the synchronization data needs to be able to be processed as a stream of data versus processing all of the data at once.
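To make the batching, partial-progress, and streaming goals a little more concrete, here is a minimal sketch. The (sequence, key, value) change format, the Replica class, and the change_feed generator are hypothetical stand-ins invented for this example; a real sync engine would read changes from the network or a local store.

```python
from itertools import islice


def in_batches(changes, size):
    """Lazily yield lists of at most `size` changes from any iterable."""
    iterator = iter(changes)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


class Replica:
    """A toy device state: a key-value store plus the last applied sequence number."""

    def __init__(self):
        self.data = {}
        self.last_applied = 0

    def apply_batch(self, batch):
        # Each change is (sequence, key, value); a value of None means delete.
        for sequence, key, value in batch:
            if value is None:
                self.data.pop(key, None)
            else:
                self.data[key] = value
            self.last_applied = sequence


def change_feed():
    """Hypothetical stream of changes; a generator, so only one batch is in memory."""
    yield (1, "title", "Draft")
    yield (2, "body", "Hello")
    yield (3, "title", "Final")
    yield (4, "body", None)
    yield (5, "tags", "sync")


replica = Replica()
batches = in_batches(change_feed(), size=2)

# Apply only the first batch, as if the connection dropped afterwards.
replica.apply_batch(next(batches))
partial_state = dict(replica.data)
partial_sequence = replica.last_applied

# Later, continue with the remaining batches.
for batch in batches:
    replica.apply_batch(batch)
```

After the first batch the replica is not at the latest state, but it is in a valid intermediate state and knows how far it got, so a later request only needs to cover the changes after that point.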

With the above desired goals, a few other properties become desirable:

  • The processing of the synchronization data should be idempotent. A device could receive the same synchronization data multiple times. Either the device needs to be able to identify that the updates have already been processed or processing the data should be idempotent. Using specialized data structures such as conflict-free replicated data types may help simplify processing.

  • Any device should be able to track what changes have been made since a local point in relative time. By keeping track of changes, a device can identify what changes were made locally which need to be transmitted to another device. Furthermore, a device could synchronize with multiple devices (or services) which requires multiple points in relative time to be able to be tracked. Note that relative time may not be a traditional clock time but could be as simple as a counter or a change token.

  • As a corollary, a local device should be able to inform another service or device what data has already been synchronized. Whether it is using (change) tokens or other means, a local device can inform another device what it has already processed to reduce the number of requests and data required to get the local device up to date.

  • Any device (including services) which has the data should have ACID transactions. ACID compliance makes synchronization easier by ensuring updates are actually processed when an operation says they are processed.
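The properties above can be sketched together using Python's built-in sqlite3 module. The table layout, the integer change token, and the function names are assumptions for illustration only; this sketch skips already-seen updates by token rather than using conflict-free replicated data types.

```python
import sqlite3


def open_store():
    """Create an in-memory store with the data and a per-peer change token."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE items (key TEXT PRIMARY KEY, value TEXT)")
    db.execute("CREATE TABLE sync_state (peer TEXT PRIMARY KEY, token INTEGER NOT NULL)")
    return db


def last_token(db, peer):
    """Return the last token processed from `peer`, or 0 if none."""
    row = db.execute("SELECT token FROM sync_state WHERE peer = ?", (peer,)).fetchone()
    return row[0] if row else 0


def apply_updates(db, peer, updates):
    """Apply (token, key, value) updates from `peer`, skipping ones already processed.

    The data changes and the new token are committed in one transaction,
    so the stored token never gets ahead of (or behind) the stored data.
    """
    applied = 0
    with db:  # commits on success, rolls back if an exception is raised
        seen = last_token(db, peer)
        for token, key, value in sorted(updates, key=lambda u: u[0]):
            if token <= seen:
                continue  # duplicate delivery: already processed
            db.execute("INSERT OR REPLACE INTO items (key, value) VALUES (?, ?)", (key, value))
            seen = token
            applied += 1
        db.execute("INSERT OR REPLACE INTO sync_state (peer, token) VALUES (?, ?)", (peer, seen))
    return applied


db = open_store()
updates = [(1, "title", "Draft"), (2, "title", "Final")]
first = apply_updates(db, "phone", updates)
second = apply_updates(db, "phone", updates)  # the same data delivered again
```

Because the token is stored per peer, the same store could track separate points in relative time for several devices or services, and last_token gives the value a device would send to tell a peer what it has already processed.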

There are many other considerations for synchronization, but hopefully these thoughts give an idea of some of the possible complexities and desired properties when synchronizing data.

Many languages have access control keywords such as public, protected, internal, fileprivate, and private. The intention is to restrict the usage of methods and access to data. In a way, it is the most basic form of encapsulation. After some recent work on a few apps, I’ve come to the conclusion that there are only two forms of access control that should be used in the world: public and internal.

Most of the effort in maintaining these different forms of access control is wasted. If you are in the position to change the code, you can modify the access control from private to public with just a few keystrokes. You may have initially wanted something to be private because you don’t want to accidentally leak the implementation details (even to yourself). Or you don’t trust other developers working on your project to use the data or methods. However, anyone who has write access to the code can change the access control or make other code changes which break implementation preconditions and invariants. Using private is only a small speed bump against bad code.

Instead of maintaining such detailed access control, for libraries, you should use only public and the equivalent of internal if available. internal means any other code in the same module (e.g. package/library) has access to the internal data/function. In the end, libraries have only two forms of access control that anyone cares about. public is for all the consumers of the library. internal is for implementation details that only the library authors should have access to.

Users of your app do not care what the access control is, so let everything be internal. The idea is that while it does not matter today, you may extract code into a re-usable module later. So make things internal in the app, and then go back and expose the required types/methods as public if the code is extracted to a module.

protected, fileprivate, private, and other access control levels should hardly be used. There have been too many times where I’ve seen people expose the implementation details through leaky abstractions already. Or someone either changes the access control to be less restrictive or copies the private code for their own uses (which can be even worse). Instead of relying on private or similar access control levels, it seems better to rely on code reviews and discipline. For larger code bases with more than a handful of developers, break the code base into separate modules. Encapsulation should be done at the module/library level versus in every code file across a monolithic application.

Most of the time, I do believe that codifying the intent behind data/methods into the codebase is a best practice, but restrictive access control is not one of those times. The next time that someone (maybe even yourself) changes the access control level of code you work on, try to imagine a world with only two levels of access control.