AWS Summit SF 2018 Monitoring Survey

Kelsey Hanger | Friday April 20 2018

Moogsoft surveyed attendees at AWS Summit SF 2018 about the IT monitoring challenges they face & the tools they’re using to solve them.

On April 4, 2018, I and 9,000 of my closest IT friends gathered for the AWS Summit SF 2018 at Moscone West. Dr. Werner Vogels, CTO of Amazon, started his keynote by saying that “this is a time to learn.”

Indeed it was, and Moogsoft took the opportunity to learn from the market by conducting another Monitoring Survey. Read on to check out what AWS users think of the current state of IT monitoring.

Key Findings

  • The top three monitoring tools were Nagios, Splunk & New Relic.
  • The top three monitoring challenges were Alert noise, time to detect and restore, and monitoring coverage.
  • The average level of alert volume per month most commonly cited by survey respondents was in the hundreds.
  • The average number of P1 / SEV-1 incidents cited by most of the attendees we surveyed was 0 – 2 per month.
  • On a scale of 1 – 10 — 1 being the most reactive, and 10 being the most proactive — most respondents scored their companies at 7.

Top Monitoring Challenges

Over 50% of AWS attendees said Alert noise is their #1 challenge right now. This is consistent with the majority of the other Monitoring Surveys we’ve conducted in the past year (SREcon18, Elastic{ON} 2018, Velocity 2017, Monitorama PDX 2017). This is a strong indication that, regardless of how many AI/ML/Agile/Digital Transformation jargon-filled trends there are in this space, people are still just trying to find a way to part the proverbial sea of red Alerts.

What’s interesting is that 76% of respondents also told us that they are using five or fewer monitoring tools. Since “monitoring coverage” is the #3 most cited Monitoring Challenge of AWS SF, maybe these companies need to invest in a few more tools to get them the coverage they need, while reducing Alert noise.

Over 50% of the attendees we surveyed told us that they only experience 0-2 P1s (wide-scale, business-impacting problems) every month. This suggests that, even though Alert noise concerns them, incidents are not regularly affecting their end-users’ experiences.

When asked to rank their companies on a scale of 1-10 (1 being the most reactive, 10 being the most proactive), over 24% of AWS users said their company is fairly proactive (7) when it comes to incident management.

When it comes to the size of AWS environments, 66% of those surveyed told us that they have 0-1,500 servers.

Over 60% of survey respondents said that they experience ‘hundreds’ of Alerts per month.

What makes this survey so perplexing is that, while Alert noise is the #1 monitoring challenge, according to every other stat that our survey returned, it shouldn’t be:

  • Alert volume is only in the hundreds per month
  • Only 0-2 P1s per month
  • Fewer than 1,500 servers

Why is Alert noise still a serious problem?

Here’s a theory: Since these companies have smaller environments, maybe these “Hundreds” of alerts are still too overwhelming for their small ITOps teams. For an Enterprise, hundreds of alerts seems like a simple matter, but for a team of one or two operators, it could very well be a “sea of red.”

Although every legacy tool (HP, IBM, BMC, CA) made the cut in our AWS survey, it’s worth noting that over 80% of those surveyed said that they don’t have an Event Management tool. This makes sense, though, if you go back to the fact that the majority of the people we talked to came from smaller companies: there isn’t necessarily a need for an Event Manager in small IT environments.

New Relic is the #1 APM tool this time around. New Relic has a solid SaaS-only APM tool, so it makes sense that it’s popular among AWS users. AppDynamics usually holds the top APM spot in these surveys, but some say that their SaaS offering is not as strong, and that’s clearly a sticking point with the AWS crowd.

SolarWinds continues to be the top NPM tool. Nothing new.

Over 60% of the respondents told us that they use Nagios. This is consistent with most of our other surveys.

66% of AWS folks told us that they use Splunk as their log management tool. Again, this is consistent with every other survey we’ve conducted.

Well, this is a first: Keynote is the #1 synthetic monitoring tool. I’m not sure why this is, but it’s nice to see an underdog take the top spot for once.

Almost 60% of surveyed AWS attendees said that they use email as their main tool to notify the right teams. I can’t imagine a worse way to notify people. I feel bad for these ops teams. But then again, this makes sense: smaller companies don’t have the money to invest in PagerDuty or xMatters, so of course they rely on email.

Jira remains the #1 ticketing tool at AWS Summit SF 2018.

Slack emerged as the King of Comms yet again.

It’s really no surprise that 95.9% of those surveyed at the AWS Summit SF use AWS as their cloud provider.

AWS Summit SF 2018 Conclusion

AWS Summit SF 2018 had a ton of startups in attendance. But this makes sense: we are in Silicon Valley. So the question is, what will the survey results look like when we attend AWS Summit Chicago? Or AWS Summit LA? These cities don’t have as many startups, so will their environments look different? Stay tuned to see the results.

Moogsoft AIOps helps modern IT Operations and DevOps teams become smarter, faster, and more effective by providing technological supplementation that automates mundane tasks, enables scalability, and frees up human beings to do what they do best — ideate, create, and innovate. Start your free trial today by clicking here.

About the Author

Kelsey Hanger is a Product Marketing Manager at Moogsoft. When she isn’t writing blogs about AIOps or conducting Monitoring Surveys, she loves finding unique eats in and around SF and traveling to parts unknown, whether that be a speakeasy in Oakland or the ruins of Monte Albán in Oaxaca, México. Feel free to tweet her @KelsHanger or connect with her on LinkedIn.
