[{"data":1,"prerenderedAt":703},["ShallowReactive",2],{"\u002Fblog\u002Fmonitoring-third-party-apis":3},{"id":4,"title":5,"author":6,"body":8,"category":691,"date":692,"description":693,"extension":694,"image":695,"lastUpdated":695,"meta":696,"navigation":697,"path":698,"readingTime":699,"seo":700,"stem":701,"__hash__":702},"blog\u002Fblog\u002Fmonitoring-third-party-apis.md","How to Monitor Third-Party APIs and External Dependencies",{"name":7},"Vantaj Team",{"type":9,"value":10,"toc":663},"minimark",[11,15,22,27,30,33,36,40,43,48,119,123,181,185,232,236,239,243,249,259,265,271,275,278,300,304,307,332,336,339,359,362,366,369,375,381,387,391,394,403,406,426,429,433,436,439,448,452,455,469,472,475,479,482,621,624,628,632,635,639,642,646,649,653,656,660],[12,13,14],"p",{},"Every application depends on services it doesn't control: payment processors, email providers, authentication services, cloud infrastructure, CDNs, and third-party APIs of every kind. When one of those services degrades, your application often fails in ways that look like your problem - timeouts, 500 errors, broken flows - but the root cause is outside your codebase.",[12,16,17,21],{},[18,19,20],"strong",{},"Third-party monitoring"," is the practice of running external checks against vendors and dependencies so you know about their failures before your users report them to you.",[23,24,26],"h2",{"id":25},"why-you-cant-rely-on-vendor-status-pages-alone","Why You Can't Rely on Vendor Status Pages Alone",[12,28,29],{},"The first instinct when something breaks is to check the vendor's status page. The problem: vendor status pages are notoriously slow to update. 
Stripe, AWS, Cloudflare, and Twilio have all had incidents where their status pages showed \"All Systems Operational\" while customers were experiencing failures.",[12,31,32],{},"A 2024 analysis of 50 major SaaS outages found that the median time between an incident starting and the vendor's status page reflecting \"investigating\" was 23 minutes. For a payment processor outage, 23 minutes is a significant revenue event.",[12,34,35],{},"External monitoring of your third-party dependencies gives you detection that's independent of what the vendor chooses to report.",[23,37,39],{"id":38},"what-to-monitor","What to Monitor",[12,41,42],{},"Not every third-party integration needs a dedicated monitor, but your critical path dependencies do.",[44,45,47],"h3",{"id":46},"tier-1-revenue-critical-dependencies","Tier 1: Revenue-Critical Dependencies",[49,50,51,67],"table",{},[52,53,54],"thead",{},[55,56,57,61,64],"tr",{},[58,59,60],"th",{},"Dependency",[58,62,63],{},"What to monitor",[58,65,66],{},"Why",[68,69,70,86,97,108],"tbody",{},[55,71,72,76,83],{},[73,74,75],"td",{},"Payment processor (Stripe, Braintree)",[73,77,78,82],{},[79,80,81],"code",{},"https:\u002F\u002Fapi.stripe.com\u002Fv1\u002F"," health endpoint",[73,84,85],{},"Checkout failures are immediate revenue loss",[55,87,88,91,94],{},[73,89,90],{},"Authentication provider (Auth0, Clerk)",[73,92,93],{},"OAuth discovery endpoint",[73,95,96],{},"Auth failures block all user access",[55,98,99,102,105],{},[73,100,101],{},"Email provider (SendGrid, Postmark)",[73,103,104],{},"API status endpoint",[73,106,107],{},"Transactional emails (receipts, resets) stop",[55,109,110,113,116],{},[73,111,112],{},"SMS provider (Twilio, AWS SNS)",[73,114,115],{},"API health endpoint",[73,117,118],{},"Alerts, 2FA, and notifications break",[44,120,122],{"id":121},"tier-2-functional-dependencies","Tier 2: Functional 
Dependencies",[49,124,125,135],{},[52,126,127],{},[55,128,129,131,133],{},[58,130,60],{},[58,132,63],{},[58,134,66],{},[68,136,137,148,159,170],{},[55,138,139,142,145],{},[73,140,141],{},"CDN (Cloudflare, Fastly)",[73,143,144],{},"Your own domain through the CDN",[73,146,147],{},"Performance degradation reaches all users",[55,149,150,153,156],{},[73,151,152],{},"Search provider (Algolia, Elasticsearch)",[73,154,155],{},"Search API endpoint",[73,157,158],{},"Product search or site search breaks",[55,160,161,164,167],{},[73,162,163],{},"File storage (S3, Cloudinary)",[73,165,166],{},"Storage endpoint or sample file",[73,168,169],{},"File uploads, images, document downloads fail",[55,171,172,175,178],{},[73,173,174],{},"Maps \u002F geocoding (Google Maps, Mapbox)",[73,176,177],{},"API endpoint",[73,179,180],{},"Location features break",[44,182,184],{"id":183},"tier-3-infrastructure-dependencies","Tier 3: Infrastructure Dependencies",[49,186,187,197],{},[52,188,189],{},[55,190,191,193,195],{},[58,192,60],{},[58,194,63],{},[58,196,66],{},[68,198,199,210,221],{},[55,200,201,204,207],{},[73,202,203],{},"DNS provider",[73,205,206],{},"Your authoritative nameservers",[73,208,209],{},"DNS failures take everything offline",[55,211,212,215,218],{},[73,213,214],{},"Certificate provider",[73,216,217],{},"Your certificates' expiry",[73,219,220],{},"Expired certificates block all HTTPS traffic",[55,222,223,226,229],{},[73,224,225],{},"Cloud provider (AWS, GCP, Azure)",[73,227,228],{},"Key regional endpoints",[73,230,231],{},"Availability zone or service issues",[23,233,235],{"id":234},"how-to-set-up-third-party-monitoring","How to Set Up Third-Party Monitoring",[12,237,238],{},"Third-party monitors work exactly like your own HTTP monitors: you point them at a URL, set a check interval, and configure alerts. 
The differences are in which URL to use and how to interpret failures.",[44,240,242],{"id":241},"finding-the-right-url-to-monitor","Finding the Right URL to Monitor",[12,244,245,248],{},[18,246,247],{},"Use the vendor's own health or status endpoint when available."," Most major APIs expose a lightweight root, health, or discovery endpoint that responds quickly with a predictable status code (not always 200) without processing a full request:",[250,251,256],"pre",{"className":252,"code":254,"language":255},[253],"language-text","Stripe:      https:\u002F\u002Fapi.stripe.com\u002Fv1\u002F (returns 401, which confirms the API is reachable)\nTwilio:      https:\u002F\u002Fapi.twilio.com\u002F\nSendGrid:    https:\u002F\u002Fapi.sendgrid.com\u002Fv3\u002F\nAuth0:       https:\u002F\u002F[your-tenant].auth0.com\u002F.well-known\u002Fopenid-configuration\nCloudflare:  https:\u002F\u002Fapi.cloudflare.com\u002Fclient\u002Fv4\u002Fuser (returns 400 without auth, confirming reachability)\n","text",[79,257,254],{"__ignoreMap":258},"",[12,260,261,264],{},[18,262,263],{},"If no health endpoint exists, monitor the endpoint your application actually calls."," A lightweight read-only endpoint (GET, not POST) that doesn't require authentication is ideal. If authentication is required, use a test API key with read-only permissions.",[12,266,267,270],{},[18,268,269],{},"Never monitor with write operations."," Don't set up a monitor that creates orders, sends emails, or charges cards. Use GET requests to read-only endpoints.",[44,272,274],{"id":273},"alert-configuration-for-third-party-monitors","Alert Configuration for Third-Party Monitors",[12,276,277],{},"Third-party APIs often have more variable response times than your own infrastructure. 
Configure your monitors with:",[279,280,281,288,294],"ul",{},[282,283,284,287],"li",{},[18,285,286],{},"Check interval:"," 1–2 minutes (third-party outages can be fast-moving)",[282,289,290,293],{},[18,291,292],{},"Failure threshold:"," 2 consecutive failures (third-party APIs have more transient blips)",[282,295,296,299],{},[18,297,298],{},"Multi-region consensus:"," Required - a network issue between one probe and a CDN edge node should not page your team",[44,301,303],{"id":302},"keyword-assertions-for-third-party-monitors","Keyword Assertions for Third-Party Monitors",[12,305,306],{},"For endpoints that always return a specific response, add a keyword assertion to confirm the API is responding correctly, not just reachable:",[279,308,309,320,329],{},[282,310,311,312,315,316,319],{},"Stripe root endpoint returns ",[79,313,314],{},"{\"error\": {...}}"," - assert on ",[79,317,318],{},"\"object\":\"error\""," or just on status code 401",[282,321,322,323,315,326],{},"Auth0 OpenID configuration returns ",[79,324,325],{},"\"issuer\":",[79,327,328],{},"issuer",[282,330,331],{},"Your CDN-fronted homepage returns your product name - assert on a stable string in your page title",[23,333,335],{"id":334},"handling-expected-non-200-responses","Handling Expected Non-200 Responses",[12,337,338],{},"Many API health endpoints return 4xx status codes when called without authentication:",[279,340,341,347,353],{},[282,342,343,346],{},[79,344,345],{},"401 Unauthorized"," when you haven't provided credentials",[282,348,349,352],{},[79,350,351],{},"400 Bad Request"," when you call without required parameters",[282,354,355,358],{},[79,356,357],{},"403 Forbidden"," for endpoints requiring specific permissions",[12,360,361],{},"This is normal. Set your monitor's expected status code to 401 (or whichever code the vendor returns unauthenticated) instead of 200. 
What you're checking is not \"does it succeed\" but \"is it responding at all.\"",[23,363,365],{"id":364},"interpreting-third-party-monitor-alerts","Interpreting Third-Party Monitor Alerts",[12,367,368],{},"When a third-party monitor fires, the immediate question is: is this a dependency problem or a network path problem?",[12,370,371,374],{},[18,372,373],{},"Check from multiple regions first."," If only one of your probe locations is seeing failures, the issue is likely a routing problem between that probe and the vendor's edge node - not a vendor outage affecting your users.",[12,376,377,380],{},[18,378,379],{},"Cross-reference with your own application error rates."," If your third-party monitor fires and your error rate on related endpoints is also elevated, it's a real dependency failure. If only the synthetic check is failing but your application looks healthy, it may be a probe-to-vendor routing issue.",[12,382,383,386],{},[18,384,385],{},"Check the vendor's status page and their engineering Twitter\u002FX."," Despite the lag, vendor status pages confirm the scope. A vendor who acknowledges a degradation on social media often does so before updating their status page formally.",[23,388,390],{"id":389},"vendor-monitoring-vs-your-own-downtime-the-distinction","Vendor Monitoring vs. Your Own Downtime: The Distinction",[12,392,393],{},"When a third-party dependency fails, two things are typically true:",[395,396,397,400],"ol",{},[282,398,399],{},"You can't fix it",[282,401,402],{},"Your users are still affected",[12,404,405],{},"The value of third-party monitoring is not immediate remediation - it's fast awareness. 
With awareness, you can:",[279,407,408,414,420],{},[282,409,410,413],{},[18,411,412],{},"Post an accurate status page update"," explaining that a third-party service is experiencing issues, reducing support ticket volume",[282,415,416,419],{},[18,417,418],{},"Route around the failure"," if you have fallbacks (a backup payment processor, a secondary email provider)",[282,421,422,425],{},[18,423,424],{},"Make architectural decisions"," about which third parties need redundancy",[12,427,428],{},"Teams without third-party monitoring often spend the first 15–30 minutes of a third-party incident debugging their own code before concluding the problem is external. Teams with it know within one check interval.",[23,430,432],{"id":431},"third-party-sla-accountability","Third-Party SLA Accountability",[12,434,435],{},"If you have a customer-facing SLA, third-party outages typically do not exempt you from it unless your SLA explicitly excludes them. Most enterprise SLAs include carve-outs for force majeure, but \"our payment processor was down\" usually doesn't qualify.",[12,437,438],{},"This is one argument for redundant critical vendors. Availability multiplies across serial dependencies: if your own stack and your payment processor each achieve 99.9% uptime, checkout is available only about 99.8% of the time (0.999 × 0.999 ≈ 0.998), which already misses a 99.9% SLA. With two redundant processors at 99.9% each and working failover, checkout fails only when both are down at once, roughly 0.0001% of the time if their failures are independent.",[12,440,441,442,447],{},"For SLA terms and how to calculate availability across dependent services, see the ",[443,444,446],"a",{"href":445},"\u002Fblog\u002Fsli-slo-sla-guide","SLI, SLO, and SLA guide",".",[23,449,451],{"id":450},"building-a-third-party-dependency-inventory","Building a Third-Party Dependency Inventory",[12,453,454],{},"Before you can monitor your dependencies, you need to know what they are. 
Run through your application and list:",[395,456,457,460,463,466],{},[282,458,459],{},"Every external HTTP call your application makes (check your network logs or APM traces)",[282,461,462],{},"Every SDK or library that makes external calls on your behalf (Stripe SDK, auth libraries, analytics)",[282,464,465],{},"Every infrastructure service that isn't in your primary cloud account",[282,467,468],{},"Every service your background jobs call",[12,470,471],{},"For each dependency, answer three questions: How critical is it? What breaks if it's unavailable? What's our fallback?",[12,473,474],{},"The answers determine monitoring priority and what architectural redundancy makes sense.",[23,476,478],{"id":477},"quick-setup-reference","Quick Setup Reference",[12,480,481],{},"For the most common third-party dependencies:",[49,483,484,501],{},[52,485,486],{},[55,487,488,491,494,498],{},[58,489,490],{},"Vendor",[58,492,493],{},"Monitor URL",[58,495,497],{"align":496},"center","Expected status",[58,499,500],{},"Assertion",[68,502,503,520,538,556,571,589,604],{},[55,504,505,508,512,515],{},[73,506,507],{},"Stripe",[73,509,510],{},[79,511,81],{},[73,513,514],{"align":496},"401",[73,516,517],{},[79,518,519],{},"\"object\"",[55,521,522,525,530,533],{},[73,523,524],{},"Auth0",[73,526,527],{},[79,528,529],{},"https:\u002F\u002F[tenant].auth0.com\u002F.well-known\u002Fopenid-configuration",[73,531,532],{"align":496},"200",[73,534,535],{},[79,536,537],{},"\"issuer\"",[55,539,540,543,548,550],{},[73,541,542],{},"Twilio",[73,544,545],{},[79,546,547],{},"https:\u002F\u002Fapi.twilio.com\u002F",[73,549,532],{"align":496},[73,551,552,555],{},[79,553,554],{},"\"Twilio\""," or status code",[55,557,558,561,566,568],{},[73,559,560],{},"SendGrid",[73,562,563],{},[79,564,565],{},"https:\u002F\u002Fapi.sendgrid.com\u002Fv3\u002F",[73,567,514],{"align":496},[73,569,570],{},"status code",[55,572,573,576,581,584],{},[73,574,575],{},"Cloudflare 
API",[73,577,578],{},[79,579,580],{},"https:\u002F\u002Fapi.cloudflare.com\u002Fclient\u002Fv4\u002F",[73,582,583],{"align":496},"400",[73,585,586],{},[79,587,588],{},"\"success\"",[55,590,591,594,599,602],{},[73,592,593],{},"AWS us-east-1",[73,595,596],{},[79,597,598],{},"https:\u002F\u002Fs3.us-east-1.amazonaws.com\u002F",[73,600,601],{"align":496},"200 or 403",[73,603,570],{},[55,605,606,609,614,616],{},[73,607,608],{},"GitHub",[73,610,611],{},[79,612,613],{},"https:\u002F\u002Fapi.github.com\u002F",[73,615,532],{"align":496},[73,617,618],{},[79,619,620],{},"\"current_user_url\"",[12,622,623],{},"Check interval: 1 minute. Failure threshold: 2 consecutive. Multi-region: required.",[23,625,627],{"id":626},"frequently-asked-questions","Frequently Asked Questions",[44,629,631],{"id":630},"what-is-third-party-monitoring","What is third-party monitoring?",[12,633,634],{},"Third-party monitoring is the practice of running synthetic checks against external APIs and services that your application depends on. These checks verify that the vendor's service is reachable and responding correctly, independent of the vendor's own status reporting. When a vendor's service degrades, your monitors detect it within one check interval rather than waiting for a user report or vendor acknowledgment.",[44,636,638],{"id":637},"can-i-monitor-stripe-twilio-or-other-vendors-i-dont-control","Can I monitor Stripe, Twilio, or other vendors I don't control?",[12,640,641],{},"Yes. Synthetic HTTP monitoring sends requests to any publicly reachable URL, regardless of who owns it. Most major API providers expose a root endpoint or health endpoint that responds to unauthenticated requests with a predictable status code. 
Configure your monitor to expect that status code rather than 200, and you have a working third-party dependency check.",[44,643,645],{"id":644},"should-i-use-authenticated-requests-for-third-party-monitors","Should I use authenticated requests for third-party monitors?",[12,647,648],{},"For lightweight availability checks, unauthenticated GET requests to public or discovery endpoints are preferable. They don't count against your account's API rate limits, don't require managing credentials in your monitoring tool, and are faster to set up. Use authenticated requests only when the endpoint you care about doesn't respond without credentials and there's no equivalent public endpoint to check.",[44,650,652],{"id":651},"how-do-i-avoid-rate-limiting-my-third-party-monitors","How do I avoid rate limiting my third-party monitors?",[12,654,655],{},"Use lightweight, read-only endpoints that don't trigger business logic. A check every 1–2 minutes against a root or health endpoint generates at most 1,440 requests per day per probe location, which is typically well within the rate limits of major API providers. Avoid POST endpoints or endpoints that trigger expensive operations on the vendor's side.",[44,657,659],{"id":658},"what-do-i-do-when-a-third-party-monitor-alerts","What do I do when a third-party monitor alerts?",[12,661,662],{},"First, check whether the alert is firing from multiple probe regions or just one. A single-region failure suggests a routing issue, not a vendor outage. If multiple regions confirm the failure, check the vendor's status page and engineering social media for acknowledgment. Post an update to your own status page noting that a third-party service is experiencing issues. Determine if a fallback or manual workaround is available. 
The vendor's outage is not yours to fix; your job is to communicate accurately and minimize user impact while the vendor resolves it.",{"title":258,"searchDepth":664,"depth":664,"links":665},2,[666,667,673,678,679,680,681,682,683,684],{"id":25,"depth":664,"text":26},{"id":38,"depth":664,"text":39,"children":668},[669,671,672],{"id":46,"depth":670,"text":47},3,{"id":121,"depth":670,"text":122},{"id":183,"depth":670,"text":184},{"id":234,"depth":664,"text":235,"children":674},[675,676,677],{"id":241,"depth":670,"text":242},{"id":273,"depth":670,"text":274},{"id":302,"depth":670,"text":303},{"id":334,"depth":664,"text":335},{"id":364,"depth":664,"text":365},{"id":389,"depth":664,"text":390},{"id":431,"depth":664,"text":432},{"id":450,"depth":664,"text":451},{"id":477,"depth":664,"text":478},{"id":626,"depth":664,"text":627,"children":685},[686,687,688,689,690],{"id":630,"depth":670,"text":631},{"id":637,"depth":670,"text":638},{"id":644,"depth":670,"text":645},{"id":651,"depth":670,"text":652},{"id":658,"depth":670,"text":659},"use-cases","2026-06-29","When Stripe goes down, your checkout breaks. When Twilio fails, your SMS alerts stop. Monitoring external dependencies gives you early warning when a vendor's problems become your users' problems.","md",null,{},true,"\u002Fblog\u002Fmonitoring-third-party-apis",9,{"title":5,"description":693},"blog\u002Fmonitoring-third-party-apis","6mY6WTTpglzLDG8WViB_6UdQZtt8i_kRHmaA6uHazKI",1782464112795]