When an API Won't Give You Your Own Data: Building Around a Write-Only System

The Situation Is More Common Than You Think

"We have an API" is one of the most misleading sentences in enterprise software. It usually means "we have an API for the specific inbound integrations we have decided to support," not "you can freely query your own data." The documentation looks comprehensive. The endpoints are listed. It looks like a standard data integration is on the table. Then you try to read data back out, and the picture changes.

This is not a rare edge case. Field service management platforms, job management tools, legacy CRMs, accounting systems, and plenty of vertical SaaS products are in the same position. Their integration story is built around getting data in: syncing contacts from a CRM, creating jobs programmatically, pushing orders from an e-commerce store. For reading data back out (the revenue reports, the order history, the customer records you actually need to feed your reporting, your ad platforms, or your data warehouse), there is effectively nothing.

The consequence is that the data you own, sitting in a system you pay for, is trapped behind a wall the vendor built for their own convenience. You can see it on screen. You can export it as a CSV, if you remember, every morning, before anything else demands your attention. You cannot automate pulling it, because the API that would let you do that does not exist in the direction you need.

If you are evaluating software for data portability

Always ask specifically: "Can I query my job revenue data via API with full filtering and date range control?" A generic "yes, we have an API" from a sales team is not an answer to that question. Get it in writing, and test the endpoint before you build any integration dependency on it. The cost of finding out the API is write-only after you have committed to the platform is much higher than the cost of a half-day evaluation.

The Three Honest Options

When the API will not give you your data, you have three paths. We have used all three, and the right one depends on what the vendor will do for you, what the platform supports natively, and how much engineering you are willing to invest.

Ask the vendor for a private or undocumented endpoint

Sometimes a read endpoint exists but is not in the public documentation. It may have been built for a specific enterprise customer, or it may be available on a higher pricing tier. Ask support directly and specifically: "I need to pull invoiced revenue by date range via API. Is there an endpoint for that, even an undocumented one?" The answer is sometimes yes, and if it is, this is by far the cheapest path. The caveat is that undocumented endpoints can change or disappear without notice, so treat any dependency on one as fragile and build monitoring that alerts you when it breaks.
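
As a rough illustration of that monitoring, the sketch below checks that a response from such an endpoint still has the shape you depend on. The field names (invoice_id, invoiced_at, total) are hypothetical, and the payload is assumed to be a JSON array of records that has already been fetched and decoded.

```python
# Hypothetical field names: replace with whatever the endpoint really returns.
REQUIRED_FIELDS = ("invoice_id", "invoiced_at", "total")


def find_schema_problems(payload, required_fields=REQUIRED_FIELDS) -> list:
    """Return human-readable problems with a decoded JSON payload.

    An empty list means the payload still looks the way we expect.
    """
    if not isinstance(payload, list):
        return [f"expected a list of records, got {type(payload).__name__}"]
    problems = []
    for index, record in enumerate(payload):
        if not isinstance(record, dict):
            problems.append(f"record {index} is not an object")
            continue
        missing = [name for name in required_fields if name not in record]
        if missing:
            problems.append(f"record {index} is missing {', '.join(missing)}")
    return problems
```

Run a check like this on every pull and send any non-empty result to whoever owns the integration, so a silently changed endpoint shows up as an alert rather than as wrong numbers downstream.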

Build a scheduled export if the platform supports it

Many platforms have a report builder that can email a CSV on a schedule. If yours does, you can automate the downstream half: a Zapier zap or n8n workflow that watches the inbox, extracts the attachment, cleans the data, and routes it to wherever it needs to go. This is the cleanest option when it is available, because the vendor is generating the file and you are only automating the processing. The limitation is that you are constrained by the report builder's format, its scheduling options, and its field coverage. If the report does not include the click ID column you need for ad platform attribution, this path will not get you all the way there on its own.
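
If you would rather handle the attachment step in your own code than in Zapier or n8n, a minimal sketch with Python's standard email library looks like this. It assumes you have already fetched the raw message bytes from the mailbox (over IMAP, for example); the addresses, subject, and file name in the sample builder are made up for illustration.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_csv_attachments(raw_email: bytes) -> list:
    """Return (filename, content_bytes) for every CSV attachment in a raw email."""
    message = message_from_bytes(raw_email, policy=policy.default)
    found = []
    for part in message.iter_attachments():
        filename = part.get_filename() or ""
        if filename.lower().endswith(".csv"):
            # decode=True undoes the transfer encoding (base64 here).
            found.append((filename, part.get_payload(decode=True)))
    return found


def build_sample_email() -> bytes:
    """Build a stand-in for a scheduled report email (all values are made up)."""
    msg = EmailMessage()
    msg["From"] = "reports@vendor.example"
    msg["To"] = "exports@yourcompany.example"
    msg["Subject"] = "Daily revenue report"
    msg.set_content("Your scheduled report is attached.")
    msg.add_attachment(
        b"Job ID,Invoice Total\n1001,$250.00\n",
        maintype="text",
        subtype="csv",
        filename="daily_revenue.csv",
    )
    return msg.as_bytes()
```

Calling extract_csv_attachments(build_sample_email()) returns the single attachment with its original bytes, ready for the cleaning step.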

Automate a headless browser that logs in and extracts the data

When the vendor will not help and the report builder is not enough, the remaining option is to automate what a human would do: open a browser, log in, navigate to the report, set the date range, trigger it, and capture the output. This sounds hacky, and we will be honest about the trade-offs in a moment, but it is often the only reliable path. We have shipped this in production for a real client, and it has run every morning for months without manual intervention. It is the option we will spend the rest of this post on, because it is the one that requires real engineering and the one most businesses do not realise is available to them.

The first two options are worth exhausting before you invest in the third. A phone call to the vendor costs nothing and occasionally solves the whole problem. A scheduled email export plus a Zapier pipeline is a straightforward build. But when neither works, the headless browser approach is the difference between having your data automated and being stuck with a daily manual export that someone eventually forgets to run.

What a Headless Browser Automation Actually Involves

Playwright and Puppeteer are browser automation frameworks that can drive a real browser without a screen. They launch a browser (typically Chromium), navigate to a URL, fill in form fields, click buttons, wait for pages to render, and capture the output, all programmatically. From the platform's perspective, the traffic looks much like a human user's session, because a real browser is making the requests. From your perspective, it is a script that runs on a schedule and produces a clean data file without anyone touching a keyboard.

A typical extraction workflow built this way has five stages:

The Headless Browser Extraction Pipeline

1. Authenticate

The script loads the login page, enters credentials, handles two-factor authentication if present, and waits for the session to establish.

2. Navigate to the report

The script clicks through to the reports section, selects the custom report, and sets the correct date range for the data being pulled.

3. Trigger and capture the report

The script clicks generate, waits for the report to build, and either downloads the CSV directly or triggers the email export, depending on how the platform delivers it.

4. Parse and clean the output

The raw export needs cleaning: currency symbols stripped, field names normalised, partial or duplicate rows removed, and click IDs matched to the correct records.

5. Route to downstream systems

The cleaned data is handed to a pipeline (Zapier, n8n, or a custom integration) that feeds it to your reporting dashboard, ad platforms, data warehouse, or accounting system.

The first three stages are the browser automation. The last two are a standard data pipeline that you would build regardless of how the data arrived. The browser automation is the part that replaces the human, and it is the part that has to be built carefully, because it is interacting with a user interface that the vendor controls and can change at any time.
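
To make the first three stages concrete, here is a minimal sketch using Playwright's Python API. Everything platform-specific in it is an assumption: the URLs, the CSS selectors, and the idea that the report downloads directly rather than arriving by email. It also skips two-factor authentication and error handling, so treat it as an outline rather than a production script.

```python
from datetime import date, timedelta

# Hypothetical values: replace with the real URLs and selectors for your platform.
LOGIN_URL = "https://app.vendor.example/login"
REPORT_URL = "https://app.vendor.example/reports/custom"


def previous_day(today: date) -> str:
    """Return yesterday's date as an ISO string (the range the report covers)."""
    return (today - timedelta(days=1)).isoformat()


def run_extraction(username: str, password: str, download_path: str) -> str:
    """Stages 1 to 3: authenticate, navigate, trigger and capture the report."""
    # Imported here so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    report_date = previous_day(date.today())
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()

        # Stage 1: authenticate.
        page.goto(LOGIN_URL)
        page.fill("#email", username)
        page.fill("#password", password)
        page.click("button[type=submit]")
        page.wait_for_selector("nav.main-menu")  # evidence the session is live

        # Stage 2: navigate to the report and set the date range.
        page.goto(REPORT_URL)
        page.fill("#date-from", report_date)
        page.fill("#date-to", report_date)

        # Stage 3: trigger the report and capture the download.
        with page.expect_download() as download_info:
            page.click("#generate-report")
        download_info.value.save_as(download_path)

        browser.close()
    return download_path
```

If the platform emails the report instead of downloading it, stage 3 ends at the click and the inbox pipeline takes over from there.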

A Real Example: SingleOps Revenue Extraction

We built exactly this workaround for a home services client, and it has run in production long enough to be a reliable reference rather than a theory. The client ran their field service operations in SingleOps, a job management platform. They were advertising on Google and Microsoft. Jobs were being completed and invoiced in SingleOps. The ad platforms knew about the clicks. SingleOps knew about the revenue. Nothing connected the two.

The goal was to pull the final invoiced job value out of SingleOps, match it to the click ID that originally brought the lead in, and upload it back to Google and Microsoft as an offline conversion so both platforms could calculate real ROAS. Without that connection, every budget decision was based on lead volume rather than actual revenue return, which is a meaningful gap in a business where job values vary widely.

SingleOps did have an API. The documentation existed, endpoints were listed, and it looked like a standard integration was possible. When we dug into it properly, the picture changed. The API was designed entirely for pushing data into SingleOps: syncing contacts, creating jobs, that kind of thing. For reading data back out (the revenue reports we actually needed), there was effectively nothing. The vendor's answer was the export button.

What we built was a Playwright script that runs on a schedule, opens SingleOps in a headless Chromium instance, authenticates with the client's credentials, navigates to the custom reports section, sets the date range to the previous day's jobs, and clicks to generate and email the report. From SingleOps's perspective, this was indistinguishable from a human running their end-of-day report. From the client's perspective, it was something that happened automatically every morning without anyone touching it.

SingleOps sends the report as a CSV email attachment. A Zapier pipeline watches the inbox for emails matching the SingleOps report sender and subject pattern, extracts the attachment, cleans the data, matches the Google Click ID (GCLID) and Microsoft Click ID (MSCLKID) to the correct revenue rows, and uploads the result to both ad platforms as offline conversions. Both platforms then had access to the actual revenue outcome for every click that converted into a completed job, and bidding algorithms could optimise toward real revenue instead of lead volume.
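
The cleaning and matching step is ordinary data work. The sketch below shows one way it could look in Python; it is not the Zapier pipeline described above, and the column names (Job ID, Invoice Total ($), GCLID, MSCLKID) are assumptions about the export rather than the real SingleOps report format.

```python
import csv
import io
import re


def normalise_header(name: str) -> str:
    """Turn 'Invoice Total ($)' into 'invoice_total'."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def parse_money(value: str) -> float:
    """Strip currency symbols and thousands separators: '$1,250.00' becomes 1250.0."""
    return float(re.sub(r"[^0-9.\-]", "", value))


def clean_report(csv_text: str) -> list:
    """Return one conversion row per job that has a click ID and a revenue value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen_jobs = set()
    conversions = []
    for raw in reader:
        row = {
            normalise_header(key): (value or "").strip()
            for key, value in raw.items()
            if isinstance(key, str) and isinstance(value, (str, type(None)))
        }
        job_id = row.get("job_id", "")
        click_id = row.get("gclid") or row.get("msclkid")
        # Skip partial rows and rows that cannot be attributed to a click.
        if not job_id or not click_id or not row.get("invoice_total"):
            continue
        # Skip duplicate rows for a job we have already emitted.
        if job_id in seen_jobs:
            continue
        seen_jobs.add(job_id)
        conversions.append({
            "job_id": job_id,
            "platform": "google" if row.get("gclid") else "microsoft",
            "click_id": click_id,
            "value": parse_money(row["invoice_total"]),
        })
    return conversions
```

Each returned row carries the platform, the click ID, and the revenue value, which is the minimum an offline conversion upload needs alongside a conversion time.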

The full case study covers the complete pipeline architecture. The point for this post is not the specific platform. It is that the pattern (a write-only API, a vendor that will not help, and a daily manual export that nobody wants to own) is common, and the workaround is proven.

The result in one sentence

Real ROAS became visible in both Google and Microsoft Ads for the first time, the data refreshed automatically every morning with zero manual exports, and nothing about SingleOps or the client's existing workflows had to change. The solution worked around the platform, not through it.

The Honest Risks and How We Handle Them

A headless browser workaround is not a set-and-forget integration. It interacts with a user interface the vendor controls, and that introduces risks that a proper API integration does not have. We are upfront about all of them, because the decision to build this way should be made with eyes open.

The vendor changes their UI and the script breaks

This is the biggest risk and it is unavoidable. When the vendor redesigns their reports page, moves a button, or changes the login flow, the script's selectors stop matching and it fails. A proper API integration is far less exposed to this problem because a published API contract changes rarely and usually with notice. The mitigation is not to pretend the risk does not exist. It is to build failure detection that alerts you immediately when the script breaks, and to structure the script so that fixing a broken selector is a minutes-long job, not a rebuild. We keep the selectors centralised and documented so that a UI change means updating one or two lines, not rewriting the automation.
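
One way to centralise selectors and detect a UI change early is sketched below. The selector values are hypothetical, and the check takes any callable that reports whether a selector matches, so it can be exercised without a browser.

```python
# All UI selectors live in one place; the values are hypothetical examples.
SELECTORS = {
    "login_email": "#email",
    "login_password": "#password",
    "login_submit": "button[type=submit]",
    "report_date_from": "#date-from",
    "report_generate": "#generate-report",
}


def find_missing_selectors(selector_exists, names=None) -> list:
    """Return the logical names whose selector no longer matches the page.

    `selector_exists` is any callable that takes a CSS selector and returns
    True when the page contains a match. With Playwright it could be
    `lambda s: page.locator(s).count() > 0`.
    """
    names = names or list(SELECTORS)
    return [name for name in names if not selector_exists(SELECTORS[name])]
```

Running this check before each stage turns a vague timeout into a specific alert such as "report_generate no longer matches", and the fix is a one-line change to the dictionary.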

Session timeouts and authentication failures

Sessions expire. Tokens refresh. Sometimes the login flow throws an unexpected captcha or a "we sent you a code" interstitial that the script was not expecting. The mitigation is to build the script to detect authentication failures, retry once, and alert if the retry fails. We also store credentials securely in a secrets manager rather than hardcoding them, and we handle the common session edge cases (expired cookies, redirect loops, slow-loading auth pages) explicitly rather than hoping the browser waits long enough.
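
The "retry once, then alert" rule can be sketched in a few lines. Here login and alert are placeholders for your own functions: login would run the browser login flow and raise AuthenticationError on failure, and alert would post to email, Slack, or whatever channel you monitor.

```python
class AuthenticationError(Exception):
    """Raised when login does not end in an authenticated session."""


def login_with_retry(login, alert, retries: int = 1):
    """Call `login`; retry on AuthenticationError, alert if every attempt fails."""
    last_error = None
    for _attempt in range(1 + retries):
        try:
            return login()
        except AuthenticationError as error:
            last_error = error
    # Every attempt failed: tell a human, then let the run fail loudly.
    alert(f"Login failed after {1 + retries} attempts: {last_error}")
    raise last_error
```

Keeping the retry count low is deliberate. Repeated failed logins can lock the account or trip bot detection, so a second failure is better handled by a person.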

Two-factor authentication

If the platform enforces 2FA on every login, a fully automated script cannot complete the second factor without help. The practical workarounds are: use a session cookie that the script reuses until it expires (most platforms allow sessions of days or weeks), request an API key or service account that bypasses interactive 2FA if the vendor offers one, or schedule the script to run shortly after a human login refreshes the session. None of these are ideal, and if 2FA is strictly enforced with no session persistence, a headless browser approach may not be viable. We will tell you that before we start building rather than discovering it halfway through.
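
For the session-reuse workaround, the script needs to know whether a saved session is still worth trying. The sketch below assumes the session was saved with Playwright's context.storage_state(path=...), which writes cookies with an expires field in Unix seconds, and that the platform's login cookie is called session_id (a hypothetical name).

```python
def session_is_usable(storage_state: dict, cookie_name: str, now: float,
                      margin_seconds: int = 3600) -> bool:
    """Decide whether a saved session cookie is worth reusing.

    `storage_state` is the parsed JSON written by Playwright's storage_state.
    `now` is the current Unix time in seconds.
    """
    for cookie in storage_state.get("cookies", []):
        if cookie.get("name") == cookie_name:
            expires = cookie.get("expires", -1)
            if expires == -1:
                # Browser-session cookie with no fixed expiry: we cannot judge
                # it locally, so try it and let the login check decide.
                return True
            # Require some headroom so the session does not expire mid-run.
            return expires - now > margin_seconds
    return False
```

When this returns False, the script should stop and ask for a fresh human login rather than attempt an automated login it cannot complete. A True result is only a hint, since the server can still invalidate the session on its side.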

Rate limiting and bot detection

Some platforms detect automated traffic and block it. The mitigation is to run the script at a realistic cadence (once a day, not once a minute), add human-like delays between actions, and use a residential IP or a consistent server IP rather than a known cloud provider range that bot detection flags. We have not encountered blocking on the platforms we have built for, but it is a risk worth naming, and if the platform actively blocks automation, that is a signal to revisit the vendor conversation or the scheduled export option.
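
Human-like delays are simple to add. A minimal sketch follows; the base and jitter values are arbitrary assumptions, not tuned numbers.

```python
import random
import time


def human_delay(rng: random.Random, base_seconds: float = 1.0,
                jitter_seconds: float = 1.5) -> float:
    """Return a pause length between base and base + jitter seconds."""
    return base_seconds + rng.uniform(0.0, jitter_seconds)


def pause(rng: random.Random) -> None:
    """Sleep for a randomised, human-scale interval between UI actions."""
    time.sleep(human_delay(rng))
```

Calling pause(random.Random()) between clicks and form fills keeps the script from firing actions faster than a person could.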

The common thread across all four risks is that they are manageable with monitoring and honest engineering, but they do not go away. A headless browser workaround needs someone watching it. That is why we build alerting into every script we ship: if the script fails, the right person knows within minutes, not three weeks later when someone notices the reporting dashboard has not updated. For retainer clients, we monitor the error logs and fix breaks proactively, often before the client notices. The alternative, a manual export that someone forgets to run, has worse failure modes because it fails silently.
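
A script that crashes can alert on its own, but a script that never starts cannot. A separate staleness check covers that case. The sketch below assumes each successful run records a timestamp somewhere the check can read it, and the 26-hour threshold (a daily run plus some slack) is an assumption you would tune.

```python
from datetime import datetime, timedelta
from typing import Optional


def staleness_alert(last_success: Optional[datetime], now: datetime,
                    max_age: timedelta = timedelta(hours=26)) -> Optional[str]:
    """Return an alert message if the last good run is too old, else None."""
    if last_success is None:
        return "No successful extraction has ever been recorded."
    age = now - last_success
    if age > max_age:
        hours = age.total_seconds() / 3600
        return f"Last successful extraction was {hours:.0f} hours ago."
    return None
```

Run this from a different scheduler or machine than the extraction itself, so that one outage does not silence both the job and its alarm.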

Why This Is a Build Job, Not a Config Job

There is no Zapier step for "log into a platform that does not want you to and extract your own data." There is no n8n node for it. There is no Make module for it. The reason is that this is not an integration problem. It is a browser automation problem, and browser automation requires writing code that interacts with a live user interface, handles its edge cases, and degrades gracefully when that interface changes.

This is the line where configuration tools stop and engineering begins. A Zapier zap connects two APIs. A headless browser script drives a browser that is pretending to be a human. The skills are different, the maintenance is different, and the failure modes are different. If someone offers to "set up a Zap" to extract data from a platform with no read API, they have not understood the problem.

What the build actually requires is someone who can write the browser automation in Playwright or Puppeteer, handle authentication and session management, parse and clean the output, wire it into a downstream pipeline (Zapier, n8n, or custom), and build the monitoring and alerting that catches breaks early. It is a combination of frontend automation skills, data engineering, and operational reliability. It is not a large build, but it is a real one, and it is the kind of work that pays for itself quickly when the alternative is a daily manual export that costs an hour of someone's time and breaks the moment they go on holiday.

We build these systems on n8n, Make, and Zapier for the downstream pipeline, and we write the browser automation in Playwright when the upstream system forces it. We also handle the ongoing monitoring, because a headless browser workaround without monitoring is a time bomb. If you are running ads against a platform that will not give you your revenue data, the conversion tracking gap is costing you real money every day the connection stays broken, and the fix is usually faster than people expect once the right approach is identified.

Running Ads Against a Platform That Won't Give You Its Data?

If your job management or CRM software is sitting between your ad spend and your revenue data, there is usually a way to bridge that gap, API or not. Get in touch with a brief of your setup and where the data currently lives. We will tell you honestly whether a headless browser workaround is the right call or whether there is a simpler path you have not tried.

Frequently Asked Questions

Is using a headless browser to extract data legal?

You are extracting your own business data from a platform you pay for, using credentials you control, at a realistic cadence. This is fundamentally different from scraping a third-party site you do not have an account with. That said, you should review the platform's terms of service for any clauses prohibiting automated access. In our experience, vendors are indifferent to a once-daily automated export of data you own, because it is no different in volume or intent from a human doing the same thing. If a vendor's terms explicitly prohibit automation and you are concerned, the right move is to ask them for a proper read API first.

How often does the script break?

It depends on how often the vendor updates their interface. In our experience with the SingleOps extraction, the script has run reliably for extended periods, with occasional breaks when the platform rolls out a UI update. The key is not to prevent breaks entirely, which is impossible against a UI you do not control, but to detect them immediately and fix them fast. We centralise the selectors and document the script structure so that a UI change means updating one or two lines, not rebuilding the automation.

What if the platform enforces two-factor authentication on every login?

If 2FA is strictly enforced with no session persistence, a fully automated headless browser approach may not be viable. The practical workarounds are reusing a session cookie until it expires (most platforms allow sessions of days or weeks), requesting a service account or API key that bypasses interactive 2FA, or scheduling the script to run shortly after a human login refreshes the session. We will assess this before we start building and tell you honestly if the platform's auth model makes automation impractical.

Can I just use Zapier or n8n for this instead of writing code?

No. There is no Zapier step or n8n node for logging into a platform that has no read API and extracting data through its user interface. Zapier and n8n connect APIs. A headless browser drives a browser that is pretending to be a human. The downstream pipeline (cleaning the extracted data and routing it to your ad platforms or reporting) can absolutely use Zapier or n8n. But the extraction step itself requires a Playwright or Puppeteer script written by someone who can handle browser automation, authentication, and the edge cases that live user interfaces produce.

How much does this cost to build?

A headless browser extraction with a downstream pipeline typically falls in the $3,000 to $8,000 range, depending on the complexity of the authentication flow, the number of downstream systems, and the error handling depth required. The SingleOps build included a Playwright script, a Zapier pipeline, data cleaning, and offline conversion uploads to two ad platforms. Ongoing monitoring is available on a retainer basis ($750 to $2,500/month) if you want us watching the error logs and fixing breaks proactively. We will give you a fixed quote before any work starts once we understand your specific platform and data requirements.

Should I just switch to a different platform with a proper API?

Sometimes, yes. If you are early in your relationship with the platform and the data portability gap is a structural problem that will keep costing you, migrating to a platform with a real read API is the cleaner long-term answer. But if you are deeply embedded in the platform, your team knows it, your workflows are built around it, and the only gap is data extraction, a headless browser workaround bridges that gap at a fraction of the cost and disruption of a full platform migration. We will help you assess which side of that line you are on.