Rethinking Your Infrastructure Strategy
A Working Theory
Hey, fellow Leader 🚀,
I am Artur, and welcome to my weekly newsletter. I focus on topics like IT Management, Innovation, and Leadership, with an entrepreneurial mindset. My goal is to help you navigate the corporate IT landscape: make better decisions, create awareness, and share real-world stories.
It has been a wild ride since I published the first article on the SoW. Exchanging views with you and hearing feedback on how these articles have been useful is what drives my motivation to write. Leave a comment, subscribe to the SoW, and be part of the community.
If this article resonates with you, or you know someone who might find it useful, just share the link!
There is a silent shift happening. I don’t want to start doing futurology, but I want to “put it out there” for the sake of creating awareness.
The first sign appeared when I was at an event, and during the “drinks” (or networking) part, someone shared an interesting story.
This guy (sorry, I don’t remember his name or the company, but it’s not relevant for this story) was creating a product for his startup, and in one night, he saw his monthly budget for Cloud services go down the drain.
Why? He pushed a script to production that got caught in an infinite loop during nightly execution. It consumed the CPU and exhausted the entire monthly budget in one night.
People at the event criticised the lack of guardrails for this kind of situation. But the main message is: Cloud can be very expensive.
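To make the guardrail idea concrete, here is a minimal sketch of one possible script-level protection: a budget object that stops a job once it exceeds a number of iterations or a maximum runtime. The names RunBudget, BudgetExceeded, and nightly_job are hypothetical and only for illustration. It is an assumption about how such a guard could look, not a description of what the startup in the story used, and it does not replace spending alerts configured on the provider side.

```python
import time


class BudgetExceeded(RuntimeError):
    """Raised when a job goes past its allowed iterations or runtime."""


class RunBudget:
    """Hypothetical guardrail: caps the iterations and wall-clock time of a job."""

    def __init__(self, max_iterations, max_seconds, clock=time.monotonic):
        self.max_iterations = max_iterations
        self.max_seconds = max_seconds
        self._clock = clock
        self._start = clock()
        self.iterations = 0

    def tick(self):
        """Call once per loop iteration; raises as soon as a limit is crossed."""
        self.iterations += 1
        if self.iterations > self.max_iterations:
            raise BudgetExceeded(f"more than {self.max_iterations} iterations")
        if self._clock() - self._start > self.max_seconds:
            raise BudgetExceeded(f"ran longer than {self.max_seconds} seconds")


def nightly_job(budget):
    """Stand-in for the buggy script: a loop whose exit condition never fires."""
    while True:
        budget.tick()
```

With a guard like this, the runaway loop fails fast with an error that someone can investigate in the morning, instead of burning CPU until the budget is gone.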
I am not an infra guy, but infrastructure costs are something we all care a lot about, and I have noticed a new trend forming.

Cloud Repatriation
What is Cloud Repatriation?
Nowadays, companies tend to implement cloud-based infrastructures, and using a cloud provider is often the first option they consider.
However, some companies are considering doing the reverse for some projects.
They are moving away from the Cloud due to costs, and the technical term for this movement is “Cloud Repatriation”.
Data centers tend to have a three-to-five-year cycle for their infrastructure lifetime. This means that every few years, the hardware being used in a data center should be replaced by a newer (and hopefully better) version. Demand for new hardware increases, and the cost follows this trend.
This is especially true given the enormous infrastructure requirements connected to AI and the increased investments planned for the near future. Consequently, hardware around the world is becoming more expensive.
It wouldn’t be surprising to see hardware in these data centers kept in service for up to seven years as a cost-saving management decision.
Actually, one of the theories regarding the AI bubble sits exactly on this topic: AI infrastructure requires an enormous investment in hardware, while profit margins are still thin (not talking about Nvidia). But I guess this is a topic for a different article.
Why Is This Relevant?
The majority of companies have selected a specific cloud provider (typically AWS or GCP), but many companies and executives don’t realize how vendor-locked they truly are.
Migrating applications to a public cloud provider is not an easy task, and once it’s done, the last thing executives want to hear about is another migration.
In financial services, the preferred solution is often building private clouds due to privacy and regulatory constraints, which puts these companies in a better position to gain leverage over vendors during contract negotiations.
However, the AI pressure on hardware means one thing: hardware will get immensely expensive. So cost will be a big part of the decision to stay on public cloud infrastructure, especially if a company already has some physical servers lying around.
Another point is of a more political nature. Our friends in the US have elected a President with a very strong media presence, who is rocking the boat of international politics. My point here is not to make a political assessment (it is out of the scope of this newsletter), but I do need to make a management assessment.
Business requires stability, and we need visibility on what is ahead. Dear Mr. Trump has started a discussion inside the EU about technological sovereignty. Today, there is no actual European competitor for AWS and GCP, but it is a topic where we (in the EU) are on alert.
The European Union is slow and bureaucratic, but it is stable. This stability is key for doing business and making long-term decisions. Compliance can be a very strong argument for infrastructure repatriation, and the companies that have on-prem infra today mostly have it because of legacy systems or regulations. However, we might see some evolution in the near future, depending on how the political landscape evolves.
Tangible Argument. Cost.
AWS and GCP are the kings of Cloud Services, and we need to pay attention to how their services are priced. The cost of hardware has only one trend: up. Every company should have processes in place to optimize its cloud usage and become cost-efficient.
However, if a system is data or processing-heavy, it is important to start asking the question: What would be the TCO (Total Cost of Ownership) if I put this system on-prem?
Besides the cost of the physical hardware, the TCO must also account for the dedicated people needed to maintain it and for all the IT security requirements.
Looking at the hardware alone, moving away from the cloud could pay off financially for some data/CPU-intensive systems. Even if hardware for on-prem infrastructure can be depreciated for tax purposes, it is still a significant CAPEX (capital investment) to take into consideration.
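A back-of-the-envelope comparison shows why the people and security costs matter in the TCO. The sketch below is a deliberately simplified model: the function names are hypothetical, and every figure is an invented placeholder, not a benchmark or a real price. It ignores depreciation, facilities, power, and migration costs, which a real TCO would need to include.

```python
def cloud_tco(monthly_bill, years):
    """Total cloud spend over the period, assuming a flat monthly bill."""
    return monthly_bill * 12 * years


def on_prem_tco(hardware_capex, yearly_staff, yearly_security, years):
    """Up-front hardware plus the recurring people and security costs."""
    return hardware_capex + (yearly_staff + yearly_security) * years


# All figures below are invented placeholders for illustration only.
YEARS = 5  # upper end of a three-to-five-year hardware cycle

cloud = cloud_tco(monthly_bill=20_000, years=YEARS)
hardware_only = on_prem_tco(300_000, 0, 0, YEARS)
full_on_prem = on_prem_tco(300_000, 120_000, 30_000, YEARS)

print(cloud, hardware_only, full_on_prem)  # 1200000 300000 1050000
```

With these made-up numbers, the hardware alone looks four times cheaper than the cloud, but once staff and security are included, the gap shrinks to a much more modest saving. That is the difference between a hardware comparison and a TCO.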
Moving back to on-prem today, in any capacity, is primarily a savings decision rather than a strategic or operational shift. Tomorrow, the political landscape could stir things up. Either way, a rigorous analysis is the only way to validate the move and avoid a costly mistake.
I won’t be surprised if Cloud costs are on everyone’s agenda in the next couple of years, as AI increases its hardware footprint in the global markets. It is also important to keep an eye on how technological sovereignty evolves in the EU.
If Costs Are Getting Too High, What Are The Solutions?
Let’s make a small shortlist of actions first:
- Use AI to understand where the waste is in your infrastructure:
  - Instances running 24/7 when there is no need for it (especially Dev and Testing environments)
  - Oversized VMs or other resources
- Implement caching (Redis, CDNs), especially across different regions
- Redesign system transfers and storage:
  - Question whether we really need to store so much data (typically, teams forget to implement data cleanup and purge scripts on new systems)
  - Add compression
  - Identify the main egress points and implement ideas to mitigate them
AI is a great help for these assessments. I am not a Cloud Architect, but I have seen AI-based analysis work wonders in cutting the costs of Cloud infrastructures.
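Before bringing in any AI tooling, the first two checks from the shortlist (always-on non-production instances and oversized resources) can be approximated with simple rules. The sketch below is not AI and not a provider API: the Instance record, the find_waste function, and both thresholds are assumptions made for illustration, and the usage data would have to come from your own monitoring or billing export.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical usage record, e.g. built from a monitoring export."""
    name: str
    environment: str          # "dev", "test", or "prod"
    avg_cpu_percent: float    # average CPU over the observed period
    hours_on_per_week: float  # a full week has 168 hours


def find_waste(instances, cpu_threshold=10.0, always_on_hours=160.0):
    """Return (instance name, reason) pairs for likely waste candidates."""
    findings = []
    for inst in instances:
        is_non_prod = inst.environment in ("dev", "test")
        if is_non_prod and inst.hours_on_per_week >= always_on_hours:
            findings.append((inst.name, "non-prod instance running almost 24/7"))
        if inst.avg_cpu_percent < cpu_threshold:
            findings.append((inst.name, "possibly oversized: low average CPU"))
    return findings


# Invented sample data for illustration only.
sample = [
    Instance("dev-api", "dev", 35.0, 168.0),
    Instance("prod-db", "prod", 4.0, 168.0),
    Instance("test-ui", "test", 50.0, 40.0),
]

for name, reason in find_waste(sample):
    print(f"{name}: {reason}")
```

A rule-based pass like this only surfaces candidates. Whether an instance is really oversized still needs someone who knows the workload to confirm it.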
However, if all these analyses are done and you still need another solution: go hybrid.
Of course, a TCO is mandatory to make a final decision, but it would be possible to design a system architecture where the back-end would be On-Prem while the front-end (or the light part of the system) would be on a Cloud infra.
The majority of the costs come from data, transfer, and processing. If the heavy lifting of a system is done in the backend, it would be interesting to have this part on an internal server.
In hybrid architectures, it is key to ensure the heavy processes run on the on-prem servers. Otherwise, it defeats the purpose of a hybrid architecture.
Most importantly, watch the egress fees, meaning the cost charged by a cloud provider for data that leaves its infrastructure. This is a key aspect of making the hybrid approach work.
If a user downloads a file from a web application and that file passes through the cloud on its way to the user, the company pays egress fees, even if the file originally comes from the on-prem servers. To be cost-efficient, that file should reach the user without going through any cloud services. Otherwise, the system defeats the purpose of cost optimization.
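The effect of the download path can be estimated with simple arithmetic. The sketch below assumes a flat per-GB egress price, which is a simplification, and the price used is a placeholder rather than any provider's actual rate. The function name and parameters are hypothetical.

```python
def monthly_egress_cost(downloads_per_month, file_size_gb, price_per_gb, via_cloud):
    """Cloud egress fee for file downloads under a flat per-GB price.

    When the file is served straight from on-prem, no cloud egress fee
    applies (on-prem bandwidth still has its own cost, not modeled here).
    """
    if not via_cloud:
        return 0.0
    return downloads_per_month * file_size_gb * price_per_gb


# Invented placeholder figures for illustration only.
DOWNLOADS = 10_000
FILE_SIZE_GB = 0.5
PRICE_PER_GB = 0.09  # placeholder, not a real provider price

through_cloud = monthly_egress_cost(DOWNLOADS, FILE_SIZE_GB, PRICE_PER_GB, via_cloud=True)
direct = monthly_egress_cost(DOWNLOADS, FILE_SIZE_GB, PRICE_PER_GB, via_cloud=False)

print(f"through cloud: {through_cloud:.2f}, direct from on-prem: {direct:.2f}")
```

The same files and the same users produce a recurring bill in one design and none in the other, which is why the download path deserves a place in the architecture review.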
Making such an investment would require a detailed and careful financial and technical analysis to decide where to go.
That’s it. If you find this post useful, please share it with your friends or colleagues who might be interested in this topic. If you would like to see a different angle, suggest it in the comments or send me a message.
Cheers,
Artur

