A common way to orchestrate these steps is to call API services built on Cloud Functions or Cloud Run, or a public SaaS API such as SendGrid, which sends an email with our PDF attachment. But real-life scenarios are typically much more complex than the example above, and require continuous tracking of all workflow executions, error handling, decision points and conditional jumps, iteration over arrays of entries, data conversions, and many other advanced features.
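To make the email step concrete, here is a minimal sketch of what it could look like as a step in Workflows' YAML syntax. This is illustrative only: the API key and recipient passed in as runtime arguments are placeholders, the sender address is made up, and the PDF attachment (which SendGrid accepts as a base64-encoded `attachments` field) is omitted for brevity.

```yaml
main:
  params: [args]
  steps:
    - sendEmail:
        # Call the SendGrid v3 Mail Send API over HTTPS
        call: http.post
        args:
          url: https://api.sendgrid.com/v3/mail/send
          headers:
            Content-Type: application/json
            Authorization: ${"Bearer " + args.sendgridApiKey}
          body:
            personalizations:
              - to:
                  - email: ${args.recipient}
            from:
              email: noreply@example.com
            subject: Your invoice
            content:
              - type: text/plain
                value: Please find your invoice attached.
        result: emailResult
    - done:
        # Return the HTTP status code from SendGrid (202 on success)
        return: ${emailResult.code}
```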
Which is to say, while you can technically use general-purpose tools to manage this process, it's not ideal. For example, consider some of the challenges you'd face processing this flow with an event-based compute platform like Cloud Functions. First, the maximum duration of a Cloud Functions execution is nine minutes, but workflows, especially those involving human interactions, can run for days; your workflow may need more time to complete, or you may need to pause between steps while polling for a response status. Chaining multiple Cloud Functions together with, for instance, Pub/Sub also works, but there's no simple way to develop or operate such a workflow. In this model it's very hard to associate step failures with workflow executions, making troubleshooting difficult. And understanding the state of all workflow executions requires a custom-built tracking model, further increasing the complexity of this architecture.
In contrast, workflow products provide support for exception handling and give visibility into executions and the state of individual steps, including successes and failures. Because the state of each step is individually managed, the workflow engine can seamlessly recover from errors, significantly improving the reliability of the applications that use the workflows. Lastly, workflow products often come with built-in connectors to popular APIs and cloud products, saving time and letting you plug into existing API interfaces.
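As a sketch of what this built-in exception handling looks like in Workflows, a step can wrap an HTTP call in a `try`/`retry`/`except` block; the polled URL and step names below are hypothetical:

```yaml
main:
  steps:
    - getStatus:
        try:
          call: http.get
          args:
            # Hypothetical status endpoint being polled
            url: https://status.example.com/api/v1/job
          result: statusResponse
        # Retry transient HTTP failures using the built-in default policy
        retry: ${http.default_retry}
        except:
          as: e
          steps:
            - handleError:
                # The engine surfaces the failure details per execution
                return: ${"job status check failed: " + e.message}
    - success:
        return: ${statusResponse.body}
```

Each execution's step-by-step state, including which branch ran and why, is visible in the execution history without any custom tracking code.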
Workflow products on Google Cloud
Google Cloud’s first general-purpose workflow orchestration tool was Cloud Composer.
Based on Apache Airflow, Cloud Composer is great for data engineering pipelines like ETL orchestration, big data processing or machine learning workflows, and integrates well with data products like BigQuery or Dataflow. For example, Cloud Composer is a natural choice if your workflow needs to run a series of jobs in a data warehouse or big data cluster, and save results to a storage bucket.
However, if you want to process events or chain APIs in a serverless way—or have workloads that are bursty or latency-sensitive—we recommend Workflows.
Workflows scales to zero when you’re not using it, incurring no costs when it’s idle. Pricing is based on the number of steps in the workflow, so you only pay if your workflow runs. And because Workflows doesn’t charge based on execution time, if a workflow pauses for a few hours in between tasks, you don’t pay for this either.
Workflows scales up automatically with very low startup time and no “cold start” effect. It also transitions quickly between steps, supporting latency-sensitive applications.
Workflows use cases
When it comes to the number of processes and flows that Workflows can orchestrate, the sky’s the limit. Let’s take a look at some of the more popular use cases.
Processing customer transactions
Imagine you need to process customer orders and, if an item is out of stock, trigger an inventory refill from an external supplier. During order processing you also want to notify your sales reps about large customer orders. Sales reps are more likely to react quickly if they get such notifications via Slack.
Here is an example workflow diagram.
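The flow above can be sketched in Workflows YAML roughly as follows. Everything specific here is an assumption for illustration: the inventory and supplier service URLs, the Slack webhook, the field names on the `order` object, and the $10,000 threshold for a "large" order are all hypothetical.

```yaml
main:
  params: [order]
  steps:
    - checkInventory:
        # Look up current stock for the ordered item (hypothetical service)
        call: http.get
        args:
          url: ${"https://inventory.example.com/items/" + order.itemId}
        result: inventory
    - stockDecision:
        switch:
          - condition: ${inventory.body.quantity < order.quantity}
            next: triggerRefill
          - condition: true
            next: checkOrderSize
    - triggerRefill:
        # Ask the external supplier to refill stock (hypothetical API)
        call: http.post
        args:
          url: https://supplier.example.com/refill
          body:
            itemId: ${order.itemId}
            quantity: ${order.quantity}
        next: checkOrderSize
    - checkOrderSize:
        switch:
          - condition: ${order.total > 10000}
            next: notifySales
          - condition: true
            next: finish
    - notifySales:
        # Post to a Slack incoming webhook (placeholder URL)
        call: http.post
        args:
          url: https://hooks.slack.com/services/REPLACE_ME
          body:
            # Assumes order.id is a string field
            text: ${"Large order received: " + order.id}
        next: finish
    - finish:
        return: ${"Processed order " + order.id}
```

The `switch` steps are where the workflow engine handles the decision points and conditional jumps mentioned earlier, with each branch taken recorded in the execution's history.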