All Systems Operational
Website   Operational
Log Collection   Operational
Log Search   Operational
Search Alerts   Operational
Past Incidents
Oct 15, 2018

No incidents reported today.

Oct 14, 2018

No incidents reported.

Oct 13, 2018

No incidents reported.

Oct 12, 2018

No incidents reported.

Oct 11, 2018

No incidents reported.

Oct 10, 2018

No incidents reported.

Oct 9, 2018

No incidents reported.

Oct 8, 2018

No incidents reported.

Oct 7, 2018

No incidents reported.

Oct 6, 2018

No incidents reported.

Oct 5, 2018

No incidents reported.

Oct 4, 2018
Resolved - We are back to normal! If you experience any issues, please reach out to us at support@papertrailapp.com.
Oct 4, 17:01 PDT
Update - We are continuing to monitor for any further issues.
Oct 4, 16:54 PDT
Update - We have absorbed the load spike and systems are back to normal. Delayed search alerts should be sent in the next 5 minutes.
Oct 4, 16:54 PDT
Monitoring - Search queues, alerts, and website performance have returned to normal operation. SRE is continuing to monitor to ensure stability.
Oct 4, 16:53 PDT
Update - After isolating the traffic spike, website performance has improved. Moving to degraded performance.
Oct 4, 16:48 PDT
Identified - We have identified the issue as a load spike against our system and are working to isolate the misbehaving traffic.
Oct 4, 16:39 PDT
Update - We're seeing issues expand to more components of the service. Some users may experience HTTP 500 errors while interacting with Papertrail, including during live tail and log search. Search alerts may be delayed for some users. Provisioning Papertrail as a Heroku add-on may also fail for some users.

SRE is actively engaged and working on a solution.
Oct 4, 16:28 PDT
Investigating - We are investigating an issue that might cause search alerts to be delayed. More info to come.
Oct 4, 15:47 PDT
Oct 3, 2018

No incidents reported.

Oct 2, 2018
Resolved - Some tweaks were made and bolts were tightened. All customers are back to their original positions. We're keeping a watchful eye on this but don't expect this to be an issue going forward.

As always, let support@papertrailapp.com know if anything is out of place.

Thanks for your patience and hanging out with us!
Oct 2, 16:25 PDT
Monitoring - We're putting this under observation, rolling everyone back into normal space in waves, and continuing to monitor for any new leaks.
Oct 2, 16:12 PDT
Update - Engineering is still investigating this. No new news to report.
Oct 2, 13:44 PDT
Update - We don't have any new information to report at this time. We're still investigating the root cause. Papertrail continues to function normally.
Oct 2, 11:55 PDT
Investigating - Fewer than 1/10th of 1% of users are reporting errant logs in the Events viewer. As a precaution, while we monitor this and track down a possible cause, we've migrated everyone to a different processing pipeline. The side effects include temporarily losing the velocity graphs at the bottom of the page and longer search times. All logs are being ingested and processed properly.

This is absolutely temporary and just a safety precaution at this time. Once we're confident this is no longer an issue, we'll move customers back to where they were. The rest of Papertrail should function entirely the same (alerts, the API, etc.).

Please reach out to support@papertrailapp.com if you have any concerns; we'll be glad to chat.
Oct 2, 09:46 PDT
Oct 1, 2018
Resolved - This incident has been resolved.
Oct 1, 13:09 PDT
Update - Backlogs for 80% of customers have cleared. Continuing to focus on the remaining few.
Oct 1, 12:59 PDT
Identified - The root cause for this degradation was identified and cleaned up.
Oct 1, 12:40 PDT
Update - 50% of customers are now at real-time status. The backlogs for the remaining customers are continuing to progress nicely.
Oct 1, 12:40 PDT
Update - Verified all logs are being ingested in real-time and without issue. Marking Log Collection as Operational.

About 20% of customers should be back to real-time. The remainder are heading in the right direction toward real-time.
Oct 1, 12:09 PDT
Investigating - The queue has grown a bit, to 15 minutes in total, but logs are actively being processed. SRE is continuing to monitor.
Oct 1, 11:39 PDT
Identified - We have identified the cause of the bottleneck and are now burning down the backlog.
Oct 1, 11:29 PDT
Update - Some customers are seeing a delay of 10 minutes for log ingestion and search. Our SRE team is engaged and actively clearing the backlog.
Oct 1, 11:16 PDT
Investigating - We are currently investigating this issue.
Oct 1, 11:12 PDT