When to buy, and when to build custom software

Should you buy an off-the-shelf software product? Or should you produce custom software from scratch?

No single answer satisfies this quintessential IT question. A company that custom-builds everything finds itself a step behind, with no distinct value-add. On the flip side, businesses that exclusively buy third-party solutions end up spending all their time wrangling generic products.


Building from scratch (Midjourney)

To really be savvy, approach the “build vs buy” decision with a bit of nuance. How do you evaluate? In this case study, we describe a tack that works. It involves:

  • Assessing the “fit” of off-the-shelf solutions
  • Trying to use existing solutions liberally, but with a custom software backbone
  • Supplementing each third-party tool with a migration document that clarifies the path to independence from that off-the-shelf solution

A note on software-as-a-service in 2023

Some quick background on third-party tooling

The explosion of SaaS products over the past decade has made the IT landscape more accessible and competitive. It seems like there’s an enterprise software package for just about everything. SaaS products bring transparency to the world of custom software by establishing a cost-per-feature baseline.

In our experience writing custom software integrations between different SaaS products, we’ve identified three general risks to consider:

  1. Longevity: third parties sometimes disappear, leaving companies stranded
  2. Lock-in: third parties sometimes alter their pricing or features on a dime
  3. Fit: companies sometimes misuse off-the-shelf tools, resulting in unpredictable system breaks

With that context in mind, let’s turn to how you might apply SaaS products.

Case study: build-vs-buy in the car insurance industry

Industry background

Natural disasters cause massive repair surges, and backlogs of insurance claimants swell literally overnight. K-Optional consulted on a particular application that load-balances large influxes of vehicle claims.

Car disaster

We wrote custom software that, among other things:

  • Intakes thousands and thousands of vehicle insurance claims
  • Attempts to schedule a meeting between a claimant and an official repair provider via automated SMS and emails
  • Enables a call-center team to contact claimants who don’t reply to automated communication

The application was also meant to schedule appraisals and repairs: our requirements included integrating with YouCanBook.me, a scheduling SaaS product, to save some calendar-management code.

Connecting SaaS with custom software

Here’s a simplified look at part of the initial solution:

Custom software sequence

To summarize, this application would communicate with claimants, kick them out to a scheduling link when they made a few selections, and let YouCanBookMe alert the system when an appointment was scheduled.

Early in the release of the first version, we observed the following trade-offs of the YouCanBookMe integration:

YouCanBookMe Integration: Pros

  • Offloading the complexities of time and time zones (there are a lot of them)
  • One fewer user interface to design and build
  • Free appointment reminder emails

YouCanBookMe Integration: Cons

  • Lack of control with regard to branding
  • Zero input validation: we couldn’t help it if a claimant clicked the browser back button and scheduled twice
  • Domain-modeling divergence

I’ll elaborate on that last point next. All in all, though, the integration was worth it at this point; it protected against scope bloat and helped us get to market quickly.
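The duplicate-scheduling con above can be partly contained on the receiving side, even when the booking page itself cannot be validated. The following is a minimal sketch, not the code we shipped: `BookingRegistry`, its `record` method, and the three return values are hypothetical names chosen for illustration. It assumes each incoming booking notification can be tied back to a claim identifier.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class BookingRegistry:
    """Keeps at most one active appointment per claim (hypothetical model)."""

    _by_claim: dict = field(default_factory=dict)

    def record(self, claim_id: str, starts_at: datetime) -> str:
        """Classify an incoming booking notification for a claim."""
        existing = self._by_claim.get(claim_id)
        if existing is None:
            # First booking seen for this claim: keep it.
            self._by_claim[claim_id] = starts_at
            return "accepted"
        if existing == starts_at:
            # Same claim, same slot: a repeated notification, safe to ignore.
            return "duplicate"
        # Same claim, different slot: the claimant scheduled twice.
        return "conflict"


registry = BookingRegistry()
slot = datetime(2021, 6, 1, 10, 0)
first = registry.record("claim-1", slot)
second = registry.record("claim-1", slot)
third = registry.record("claim-1", datetime(2021, 6, 2, 10, 0))
```

A "conflict" result could then be routed to the call-center team rather than silently producing two appointments for one claimant.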

Rifts in SaaS domain-modeling

Domain-driven design is an essential principle for engineering software. Essentially, it advocates for modeling software after the industry, business, or system it serves. Sound obvious? Surprisingly, it’s not intrinsic to writing custom software.

Think of a software application as an iceberg: the user-facing interface sits above water, dwarfed by the hidden mass of business logic and models below the surface. And failures in domain modeling tend to occur in the submerged majority.

Rifts in domain design; what's under the iceberg

User-facing application above the surface, business logic below the surface.

As an example, consider the YouCanBookMe system: the platform refers to “booking calendars” as “profiles”. Profiles are segmented from one another, and each may integrate with its own Google Calendar, etc. I presume that this software originally served businesses accepting appointments on behalf of various personnel, e.g. lawyers at a law firm, where “profile” made contextual sense. YouCanBookMe also describes appointments as “bookings”, which is again sensible for booking lawyers.

How we used YouCanBookMe in our system diverged from the platform’s internal representations, and therein lies the below-the-surface domain modeling rift.
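One common way to keep such a rift from spreading is to translate the vendor’s vocabulary into your own at a single boundary, so the rest of the code only speaks in claims and appointments. Here is a minimal sketch of that idea. The payload keys (`claim_ref`, `profile`, `starts_at`) are assumptions made for illustration and are not YouCanBookMe’s actual webhook schema; `Appointment` and `appointment_from_vendor_booking` are likewise hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Appointment:
    """Our own domain term: a claimant's visit to a repair provider."""

    claim_id: str
    repair_provider: str
    starts_at: datetime


def appointment_from_vendor_booking(payload: dict) -> Appointment:
    """Translate a vendor 'booking' on a 'profile' into an Appointment.

    The payload keys are assumed for this sketch, not taken from any
    real vendor documentation.
    """
    return Appointment(
        claim_id=payload["claim_ref"],
        repair_provider=payload["profile"],
        starts_at=datetime.fromisoformat(payload["starts_at"]),
    )


example = appointment_from_vendor_booking(
    {
        "claim_ref": "claim-42",
        "profile": "provider-north",
        "starts_at": "2021-06-01T10:00:00+00:00",
    }
)
```

With this shape, replacing the vendor later means rewriting one translation function rather than every place that handles an appointment.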

Repercussions

So, how do breaks in domain models manifest themselves in production code? In ways that you couldn’t easily predict.

Disconnect

Since YouCanBookMe models itself after service-industry calendar management, we had to go against the grain to make it work within our application. For example, YouCanBook.me users pay on a per-profile, per-month basis, but we had to spin up multiple calendars overnight for short stints. We were constantly adjusting our billing plan manually to keep up with storm cycles (surprisingly, simply paying for more than necessary didn’t work!). More seriously, little details that the service disregarded happened to matter tremendously to us.

Double bookings

Beware of double-bookings across multiple timezones

Our system would provision a Google Calendar and create a corresponding YouCanBookMe profile in one job. We learned the hard way that YouCanBookMe may not synchronize time zones until a few hours after both are created. We would automatically spin up a calendar and booking page for an event, distribute it via SMS, and receive bookings within minutes, only for the time zone to change on the booking page after people had scheduled. Double bookings would ensue because YouCanBookMe then treated all previous appointments as being in a different time zone. There was no way for us to set the correct time zone from the start, which forced us to wait until the two synchronized and palpably weakened the value-add of the scheduler.
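To see why a late time-zone change leads to double bookings, consider a simplified model in which each booking is stored as an absolute instant and availability is checked against those instants. This sketch is our own illustration and makes no claim about YouCanBookMe’s internals. The zone constants use fixed UTC offsets to keep the example dependency-free, and `to_instant` and `slot_is_free` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets stand in for real zones (assumption for this sketch).
EASTERN = timezone(timedelta(hours=-5))
CENTRAL = timezone(timedelta(hours=-6))


def to_instant(wall_clock: str, page_tz: timezone) -> datetime:
    """Interpret a wall-clock slot shown on the booking page as a UTC instant."""
    naive = datetime.strptime(wall_clock, "%Y-%m-%d %H:%M")
    return naive.replace(tzinfo=page_tz).astimezone(timezone.utc)


def slot_is_free(wall_clock: str, page_tz: timezone, booked: set) -> bool:
    """A slot is free if no existing booking occupies the same instant."""
    return to_instant(wall_clock, page_tz) not in booked


# Claimant A picks "10:00" while the page still uses the wrong zone.
booked = {to_instant("2021-06-01 10:00", EASTERN)}

# After the zones synchronize, the page shows Central time. The "10:00"
# slot now maps to a different instant, so it looks free to claimant B.
ten_looks_free = slot_is_free("2021-06-01 10:00", CENTRAL, booked)
nine_looks_free = slot_is_free("2021-06-01 09:00", CENTRAL, booked)
```

In this model both claimants were shown "10:00", yet the system sees no clash, while the untouched 09:00 slot appears taken.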

I think it’s safe to assume that personnel at professional service firms didn’t distribute calendars immediately or change them that often. This is an example of how a latent domain divergence may painfully rear its head.

Finally, with a SaaS dependency, you are beholden to API changes. A switch from 1-digit months to 2-digit months in YouCanBookMe’s incoming webhook date format necessitated emergency hot-fixes on our end.
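Parsing third-party dates defensively can soften this kind of change. The sketch below accepts both 1-digit and 2-digit months and days. The `YYYY-M-D` layout is an assumption made for illustration, since the actual webhook format is not reproduced here, and `parse_webhook_date` is a hypothetical helper.

```python
import re
from datetime import date

# Year, then month and day with one or two digits each (assumed layout).
_DATE_PATTERN = re.compile(r"^(\d{4})-(\d{1,2})-(\d{1,2})$")


def parse_webhook_date(raw: str) -> date:
    """Parse a date string whether or not month and day are zero-padded."""
    match = _DATE_PATTERN.match(raw.strip())
    if match is None:
        # Fail loudly on unknown layouts instead of guessing.
        raise ValueError(f"Unrecognized webhook date: {raw!r}")
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


unpadded = parse_webhook_date("2021-3-4")
padded = parse_webhook_date("2021-03-04")
```

Raising on anything unrecognized keeps a future format change visible in logs and alerts instead of letting it corrupt appointment data quietly.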

Replacing SaaS integrations with custom software

The thing that made this topology completely untenable was actually a change in Google Calendar’s API. Starting in 2020 or so, an OAuth consent flow became necessary to share Google Calendars on behalf of an account; we had previously used a private service account to fulfill this function.

The system’s dependency on YouCanBook.me meant another dependency on Google Calendar, and consequently we were better off shifting towards the “build” strategy.

The pull request that excised YouCanBook.me and Google Calendar cut a solid 15% of the code-base and immediately improved performance and experience.

Takeaways on using SaaS products vs writing custom software

In the end, integrating with YouCanBook.me helped us launch a product quickly. The troubles came after launch as the system grew and thrived.

This experience has helped us identify a holistic strategy on “build vs buy” and SaaS-stringing:

Use existing solutions where possible under time and resource constraints. Ship with a service-migration assessment.

Documentation

Our service-migration assessments are pretty informal but attempt to estimate the tech-debt associated with a dependency on a SaaS product. We answer questions like:

  • Will this product be around for a while?
  • Is this product’s domain model aligned with our own?
  • What would a migration from this SaaS product look like? What sort of resources would it necessitate?
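The questions above can also be kept next to the code as a small structured record, which makes the assessment easy to review when a dependency changes. This is only one possible shape: `MigrationAssessment`, its fields, and the example values are illustrative and do not come from our actual documents.

```python
from dataclasses import dataclass, field


@dataclass
class MigrationAssessment:
    """Informal record of the tech-debt tied to one SaaS dependency."""

    product: str
    likely_to_persist: bool       # Will this product be around for a while?
    domain_model_aligned: bool    # Does its domain model match our own?
    migration_steps: list = field(default_factory=list)
    estimated_effort_days: int = 0

    def risk_flags(self) -> list:
        """Name the general risks this dependency currently raises."""
        flags = []
        if not self.likely_to_persist:
            flags.append("longevity")
        if not self.domain_model_aligned:
            flags.append("fit")
        return flags


# Illustrative values only.
assessment = MigrationAssessment(
    product="scheduling SaaS",
    likely_to_persist=True,
    domain_model_aligned=False,
    migration_steps=["build booking page", "store appointments locally"],
    estimated_effort_days=10,
)
flags = assessment.risk_flags()
```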

Since implementing this strategy, we’ve struck a nice balance between launching quickly with minimal resources and ensuring long-term robustness.