There seems to be general confusion about the responsibilities of SRE and DevOps. I have heard quite a lot of people use the two terms interchangeably.
So I thought I'd add a bit of clarity on how these teams are so similar yet different...
The SDLC evolved over time. Before DevOps, we had a Development team (Dev & Testing) and an Ops team. The Development team was responsible for writing the code and adding value in the form of new features, and then the Ops team took over to get it deployed. There was always confusion about how the product or feature was expected to function once it reached the Ops stage. Dev built something, and Ops saw it behave differently from what the customer wanted. This led to a lot of wasted cycles going back and forth, and to shortcomings in the feature itself, its scalability or its performance.
To mitigate the dysfunction between these two teams, DevOps was proposed around 2007. The idea was to form product development teams that include Development, Testing, Documentation & Ops. Each of these teams builds smaller modules that are validated from every side of the SDLC house to ensure the feature is adequate.
As one can see, the silos were eliminated through shared responsibility. What the customer requested is what got built, and it was validated prior to release.
However, for most organizations, this culture shift is a real challenge. The large number of DevOps consulting organizations around today suggests that shifting the culture is indeed hard. For one, DevOps is still not well understood, and many organizations do it differently. It might be hard to find two organizations doing DevOps in the same fashion.
The second bottleneck is that today's cloud-native style of service delivery, with microservices, Docker, Kubernetes etc., needs another major shift in the technologies used to build, bundle and deploy. I mean, developers did not traditionally have expertise in deployment technologies like containers, service meshes etc. In the DevOps model, a single team is responsible for the end-to-end delivery of the functionality, and that becomes a hard road with a lot of hurdles.
SRE, on the other hand, takes a middle ground, accepting and understanding the friction between the Development & Ops teams. The Ops team gets an upgrade to its role and a broader set of capabilities as well. Google, which coined the term SRE, frames it roughly this way: DevOps looks at 'what' is to be done, SRE looks at 'how' it can be done.
For one, the Development team and the Ops team are kept separate but share responsibility for delivery. The two teams interact closely, while SRE tracks the measured SLAs to ensure the product performs as expected in a production environment. The advantage of this model is that it needs much less of a culture shift. In simple terms, Ops teams used to spend 90% of their time on system administration kind of work; in SRE they spend 50% of their time automating the most mundane, repetitive work. They are also responsible for keeping the product reliable and scalable, and for giving the development team feedback so that it meets those requirements.
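To make "keeping the measured SLAs" a little more concrete, here is a minimal Python sketch of the kind of check an SRE team might automate. Everything in it is an assumption made for illustration: the ServiceWindow structure, the function names, the 99.9% target and the request counts are hypothetical, and a real setup would pull these numbers from a monitoring system.

```python
from dataclasses import dataclass


@dataclass
class ServiceWindow:
    """Request counts for one measurement window (hypothetical structure)."""
    total_requests: int
    failed_requests: int


def availability(window: ServiceWindow) -> float:
    """Fraction of requests that succeeded; 1.0 when there was no traffic."""
    if window.total_requests == 0:
        return 1.0
    return 1 - window.failed_requests / window.total_requests


def meets_sla(window: ServiceWindow, target: float) -> bool:
    """True when the measured availability is at or above the SLA target."""
    return availability(window) >= target


def automation_share(automation_hours: float, total_hours: float) -> float:
    """Share of working time spent on automation rather than manual ops."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return automation_hours / total_hours


# Illustrative numbers only: 10,000 requests with 5 failures,
# checked against an assumed 99.9% availability target.
window = ServiceWindow(total_requests=10_000, failed_requests=5)
sla_ok = meets_sla(window, target=0.999)

# Illustrative 40-hour week with 20 hours spent on automation.
share = automation_share(automation_hours=20, total_hours=40)
```

The point is less the arithmetic and more that the expectation is written down as a number both teams can look at, instead of being argued about after the release.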
Cross-functional interaction is high in this format, since the successful delivery of the product is the responsibility of both teams. While new features are developed by the Development team, the SRE team ensures their reliability and deployability in a production environment. Throughout the different phases, the two teams stay distinctly separate but work closely together, and their skill sets differ, with some overlap.
DevOps & SRE are two sides of the same coin. They do a lot of similar things, but there definitely is a distinction when it comes to skill sets and culture.