Monkeynoodle.Org

Tag: Operations

Uncategorized

Sorting Events by Time

I’ve been reviewing the APIs of a number of software vendors lately, looking at how you pull data that they don’t support pushing. It’s producing a bit of a flashback to working with ugly things from the old days. Here’s a fun fact, apropos of nothing specific to any current project…

September 9, 2023
Uncategorized

Maintenance Windows and Breakage

Lorin Hochstein recently wrote about normal incidents, “a result of normal work, when everyone whose actions contributed to the incident was actually exercising reasonable judgment at the time they committed those actions.” Instead of an accident or an error, it is an incident which is the outcome of proper behavior.…

September 4, 2023
Uncategorized

Who the Tech Is Meant For

I’m pretty fascinated by the effect of social code matching in product design. In order to market and sell products, you have to fit them to the buyer: language, use cases, pricing, packaging, sales motion, and more. In large and small ways, a company’s go-to-market or an open…

August 20, 2023
Uncategorized

How Do I Drive Remediation SLAs?

Question: How do I get my organization to patch things in a timely fashion? Can I just set an SLA (Service Level Agreement) of “patch the criticals in 30 days” and track that? Speaking as a vendor who’s worked with patching systems for everything from big banks and government…

June 18, 2023
Uncategorized

Using a Data Lake for Business Insight

At a former employer some of us used to joke internally that we made the world’s cheapest business intelligence tool and the world’s most expensive log search tool. Business intelligence (BI) use cases are cheap from a data platform perspective, because value and volume are inversely proportional. All the work…

June 11, 2023
Uncategorized

Testing Product in the Field

DevOps: there is no QA, there is no infra, testing and support are everyone’s job. This works okay for unit-test-level work, but end-to-end functionality involving multiple teams breaks all the time. You can ask DevOps to take that on too, but they’ll just laugh. You can…

May 21, 2023
Uncategorized

Shewhart Control Charts

As a monitor writer, I want to alert when a value has changed a lot, quickly, in one direction or the other, but I don’t want to set hard-coded thresholds because the value’s range is expected to evolve slowly. My goal is to get useful alerts and avoid false alarms. Examples:…

May 20, 2023
Uncategorized

Uptime nines aren’t equally distributed

Once upon a time, I worked at a hosting company… sadly, after a hardware upgrade gone wrong, the database server behind a customer’s website was sitting open on a data center floor with a cracked motherboard during their launch event. We provided an overall yearly uptime better than three nines…

April 30, 2023
Uncategorized

VMBlog Post on Decentralization

Linking to this piece I wrote for VMblog: “Why Decentralized Work Calls for Decentralized Data”

February 26, 2023
Uncategorized

Metrics and Observability

I wrote this as a Twitter thread in March of 2018, but the character constraints of Twitter at that time made it extremely cryptic. Also, it’s staged as a response to Splunk’s introduction of the metrics index… and to be honest, that’s no longer interesting to me. This is an…

January 16, 2023