What indicators would suggest to you that an architecture is too complex and not resilient enough to bounce back from a failure?

CIO in Services (non-Government), 10 months ago

Many integrations to legacy "at risk" applications

Many customizations and/or complex custom-developed software components vs. off-the-shelf (OTS) low-code utilities

Niche IT player with a cutting-edge platform

Core underlying structures under business ownership control and authority

Lead Infrastructure Engineer in Finance (non-banking), 10 months ago

A few thoughts:
1. A large amount of live data vs. backup copy storage. What do you back up, why do you back it up, and how long do you retain backup copies?
2. Lack of understanding of dependencies between technologies and applications.
3. Not having verifiable and tested backup copies of key network and security component configurations.
4. A significant segregation of responsibilities in the core network and infrastructure space.
5. Not requiring, and verifying, that technical recovery plans and backup copies exist and are tested regularly for each product and service.
6. A disconnect between business resiliency goals and the allowed spending to acquire the capabilities to achieve those goals.
7. Business continuity plans not accounting for rebuilding lost data that could not be recovered from backup copies.

These are just a few of the indicators. Many times we need to take a step back and remember that resiliency is not just about whether we can do a technical recovery; it is about whether we can ride out the whole storm and continue to do business. Without everyone on board (executive leadership, management, system analysts, engineers, and even building services), true business resiliency cannot be achieved.
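
As a rough illustration of indicators 3 and 5 above, the sketch below flags services whose last successful restore test is missing or too old. The inventory, the service names, and the 90-day threshold are all hypothetical; a real check would pull these dates from whatever backup or DR tooling is in use.

```python
from datetime import date, timedelta


def overdue_restore_tests(last_tested, today, max_age_days=90):
    """Return services whose restore test is missing or older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    overdue = []
    for service, tested_on in last_tested.items():
        # None means no successful restore test has ever been recorded.
        if tested_on is None or tested_on < cutoff:
            overdue.append(service)
    return sorted(overdue)


# Hypothetical inventory: service -> date of last successful restore test.
inventory = {
    "core-firewall-config": date(2024, 1, 10),
    "payments-db": date(2024, 5, 2),
    "branch-router-config": None,
}

print(overdue_restore_tests(inventory, today=date(2024, 6, 1)))
# prints ['branch-router-config', 'core-firewall-config']
```

A list that keeps growing, or that contains network and security configurations at all, would be one concrete sign that recovery is assumed rather than verified.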

Director of IT, 10 months ago

Too many bottlenecks in the architecture, which require specific resources/assets to be recovered in a defined sequence for a smooth transition to an operational state.
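
One way to make the "defined sequence" point measurable is to write the dependencies down and compute the recovery order from them. The sketch below is a minimal example using Python's standard `graphlib` module; the component names and the dependency map are made up for illustration, and the longest-chain number is only a rough proxy for how serialized a recovery would be.

```python
from graphlib import TopologicalSorter


def recovery_order(depends_on):
    """Return components in an order where every dependency comes first."""
    return list(TopologicalSorter(depends_on).static_order())


def longest_chain(depends_on):
    """Length of the longest dependency chain (steps that cannot run in parallel)."""
    depth = {}
    for node in recovery_order(depends_on):
        deps = depends_on.get(node, ())
        # A component's depth is one more than its deepest dependency.
        depth[node] = 1 + max((depth[d] for d in deps), default=0)
    return max(depth.values(), default=0)


# Hypothetical map: component -> components that must be restored before it.
deps = {
    "online-banking": {"app-servers", "dns"},
    "app-servers": {"database", "network"},
    "database": {"storage", "network"},
    "storage": {"network"},
    "dns": {"network"},
}

print(recovery_order(deps))
print(longest_chain(deps))  # prints 5
```

If the map contains a circular dependency, `static_order()` raises `graphlib.CycleError`, which is itself a useful finding: it means no valid recovery sequence exists as documented.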

No title, 10 months ago

Agree - Aligns with too many single points of failure; architecture outstrips support resources (like building a great house on a bad foundation).

Strategic Banking IT advisor in Banking, 10 months ago

My answer may sound off track, but this is a first step in raising awareness:

A few years ago, we ran a half-day simulation, bringing together all the "normal" people who would be involved if a crisis happened.

The scenario was: a plane crashed into our main datacenter facility.   Everything burned.

Someone got called by the lead manager of incident management.

"How do we restart our primary services (ATM, Branches, Online Banking, Payments, Call Centers, etc.)?" 

Then we watched as everyone explained, at the appropriate moment, the actions they would take.

The conclusion: NO ONE had mastered the END-TO-END infrastructure required. For example, everyone assumed that IP addresses would be redirected to our DR site. Then, after some questions, someone realized that manual interventions were required, etc.

In the end, all agreed (including execs) that we could do better on this and that better documentation was required. At the same time, we did work on the architecture itself to increase its resiliency. Moving crucial elements to the cloud is part of this work.

Hope it makes a little sense.

Director of Engineering in Healthcare and Biotech, 10 months ago

I would add that external dependencies are something to consider. We have elite DORA metrics, but we lost an industry-wide provider this year. One of our competitors just reported a significant financial loss because they were unable to switch providers in a timely manner.
