
Part 03

CI/CD Pipeline Design

Speaker 1: Picture manually copying files to dozens of servers after every tiny fix. It's
slow, error-prone, and frankly a bit soul-crushing.
Speaker  2:  Exactly.  Continuous  integration  and  continuous  delivery—CI/CD  for
short—evolved to end that pain. They let us package code changes automatically and
push them to testing or production with a single command.
Speaker 1: Even non-technical teams appreciate the reliability. Instead of wondering if
the latest version actually made it to the customer site, everyone can see the pipeline
status in real time.
Speaker 2: Think of it like an assembly line in a factory. Once the pieces start moving,
the machinery takes care of the rest, freeing people to focus on improving the product
rather than shipping logistics.


Speaker  1:  At  its  core,  a  CI/CD  pipeline  takes  every  change  and  runs  it  through  the
same checklist—build the code, test it thoroughly, then deploy it in a predictable way.
Speaker 2: Automating these steps removes guesswork and inconsistency. The trick is
to  build  the  application  once,  store  that  exact  artifact,  and  promote  it  through  test,
staging, and production environments.
Speaker  1:  Quick  feedback  loops  are  essential.  If  a  test  fails  minutes  after  a  commit,
you can fix the issue before it grows.
Speaker 2: Security scans and quality gates slot neatly into this process as well, so new
features  never  skip  the  basic  health  checks.  The  pipeline  should  become  the  trusted
guardian that approves everything that ships.
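The build-once-and-promote idea can be sketched as a pipeline that uploads a single artifact and reuses it in every later stage. Below is a minimal GitHub Actions illustration; the build and deploy scripts, job names, and dist/ path are assumptions for the sketch, not taken from the original material.

```yaml
# Hypothetical workflow: build one artifact, then promote that exact artifact
# through staging and production instead of rebuilding it.
name: build-once-promote
on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./build.sh                  # placeholder build command
      - uses: actions/upload-artifact@v4
        with:
          name: app-package
          path: dist/                    # assumed build output directory

  deploy-staging:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: app-package              # the same artifact built above
      - run: ./deploy.sh staging         # placeholder deploy script

  deploy-production:
    needs: deploy-staging
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: app-package              # still the identical artifact
      - run: ./deploy.sh production
```

Because staging and production download the exact package the build job produced, what was tested is what ships.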


Speaker  1:  Once  a  developer  commits  code,  the  pipeline  springs  into  action.  A  clean
environment spins up so any leftover files from previous builds can't cause trouble.
Speaker 2: The code is compiled or packaged, then a suite of automated tests checks
that  nothing  obvious  broke.  If  those  tests  pass,  the  same  package  moves  on  to  a
staging area where more realistic checks occur.
Speaker  1:  Staging  mirrors  the  production  environment  closely.  It's  where  you  might
run smoke tests or manual reviews if needed.
Speaker 2: Only when everything looks good does the pipeline promote the very same
build  to  production.  That  consistent  flow—from  commit  to  build  to  test  to
deploy—means you always know exactly what you're releasing.


Speaker 1: One popular way to run a pipeline is through GitHub Actions. It's basically a
set of instructions written in a simple YAML file that tells GitHub what to do whenever
new code arrives.
Speaker  2:  You  can  define  separate  jobs  for  building,  testing,  and  deploying.  That
separation keeps things tidy and makes it easy to see where a failure happens.
Speaker 1: The workflow can also reuse community-created actions for common tasks
like setting up a programming language or uploading artifacts.
Speaker 2: And because sensitive information is stored in encrypted secrets, you don't
have to hard-code passwords or access keys. It's like giving the pipeline a safe to keep
its credentials locked away while it works.
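A compact sketch of such a workflow, with separate build, test, and deploy jobs; the Node.js toolchain, npm commands, deploy script, and the DEPLOY_TOKEN secret name are illustrative assumptions.

```yaml
name: ci-cd
on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4      # community action: installs Node.js
        with:
          node-version: '20'
      - run: npm ci && npm run build

  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test

  deploy:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # credentials come from encrypted repository secrets, never hard-coded
      - run: ./deploy.sh                 # placeholder deploy script
        env:
          DEPLOY_TOKEN: ${{ secrets.DEPLOY_TOKEN }}   # hypothetical secret name
```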


Speaker  1:  In  the  end,  a  solid  pipeline  is  less  about  fancy  tooling  and  more  about
building trust. Every time code flows smoothly from a developer's laptop to production,
confidence grows across the team.
Speaker 2: Automating the repetitive steps frees everyone to focus on quality and new
ideas. Mistakes still happen, but the pipeline catches many of them before customers
ever notice.
Speaker 1: Think of it as a safety net and a speed booster rolled into one. Once you've
experienced reliable CI/CD, it's hard to imagine going back to manual releases.
Speaker 2: So the real takeaway is simple: invest in your pipeline early. It pays off with
faster  feedback,  fewer  surprises,  and  calmer  midnight  releases—or  better  yet,  no
midnight releases at all.


DevOps, SRE, and Platform Engineering Careers

Speaker 1: DevOps, SRE and platform engineering roles all grew from the need to ship
software  quickly  without  breaking  things.  Ten  years  ago  you  would  rarely  see  these
titles  outside  of  tech  giants.  If  you  told  someone  you  were  a  DevOps  engineer,  they
might have asked if you fixed printers. Companies like Netflix and Google proved that
automation  and  tight  feedback  loops  were  essential  at  scale,  spawning  whole  teams
devoted to keeping services healthy.
Speaker 2: The roles overlap, yet each leans in a different direction. DevOps engineers
build  the  pipelines  and  smooth  collaboration  between  development  and  operations.
SREs stand guard over reliability by codifying incident response and measuring service levels.
Platform  engineers  design  the  internal  tools  and  "paved  roads"  that  keep  everyone
productive.  Many  of  these  professionals  contribute  to  open  source  projects  like
Kubernetes  or  Terraform,  and  remote-first  cultures  allow  them  to  work  across  time
zones.  All  three  paths  pay  well  once  you've  mastered  the  tooling,  though  diversity
challenges remain an industry topic.


Speaker  1:  DevOps  engineers  craft  the  pipelines  that  move  code  from  commit  to
production.  A  normal  day  might  involve  wiring  up  GitHub  Actions  or  Jenkins  jobs,
containerising  apps  with  Docker  and  debugging  a  failed  deployment  at  3 AM.  Their
motto: "It works on my machine" is not an acceptable release strategy.
Speaker 2: Most start as system administrators or developers who discover a knack for
automation. They thrive on curiosity, clear communication and comfort with ambiguity.
Certifications  like  AWS  DevOps  Professional  or  the  Kubernetes  CKAD  help  them
progress.  Junior  salaries  hover  around  $70k  but  can  climb  past  $120k  in  senior  roles,
often  with  remote  work  and  on-call  rotations.  Small  startups  may  have  one  DevOps
generalist, while large enterprises build teams of five or more. Over time you can grow
into architect or engineering manager positions and contribute to open source tooling
along the way.


Speaker 1: Site reliability engineers take system stability to heart. The discipline grew
out  of  Google's  need  to  manage  massive  scale,  so  it's  all  about  automation  and
repeatable operations. An SRE might spend mornings writing Kubernetes operators and
afternoons  running  chaos  experiments  or  tuning  Prometheus  alerts.  When  Netflix
battled early outages, their SREs pioneered chaos engineering to strengthen resilience.
Speaker 2: Most SREs come from software or DevOps backgrounds and are comfortable
coding  and  troubleshooting  production  issues.  They  remain  calm  when  a  3 AM  page
arrives  and  excel  at  communicating  what  went  wrong.  Certifications  such  as  Google's
Professional  Cloud  DevOps  Engineer  bolster  credibility.  Entry  salaries  start  near  $80k,
but  staff  and  principal  SREs  can  exceed  $140k.  On-call  rotations  and  incident
retrospectives  are  part  of  the  culture,  though  many  teams  are  remote  friendly.  Large
companies  maintain  dedicated  SRE  squads,  while  smaller  firms  rely  on  a  handful  of
specialists who obsess over metrics to keep services healthy.


Speaker  1:  Platform  engineers  provide  the  reusable  tooling  that  keeps  teams
productive.  Think  of  them  as  the  people  building  the  roads  and  traffic  lights  so
everyone  else  can  drive  smoothly.  A  platform  engineer  might  design  Kubernetes
templates,  manage  golden  AMIs  or  maintain  an  internal  developer  portal.  Spotify's
Backstage  project  is  a  great  example  of  tooling  that  began  inside  one  company  and
became open source for all.
Speaker  2:  Many  platform  engineers  come  from  DevOps  or  SRE  backgrounds  and
discover they love building shared services more than maintaining single applications.
Empathy  for  developers,  an  eye  for  system  design  and  a  dash  of  product  thinking  all
help.  Salaries  often  start  around  $90k  and  can  reach  $150k  for  senior  architects,
especially if you hold cloud certifications. Larger enterprises may have teams of twenty
or  more,  while  small  start-ups  rely  on  a  single  expert.  Over  time  you  can  grow  into  a
platform  architect  or  product  manager  role.  Remember,  nobody  notices  the  road
builders until there's a pothole!


Speaker 1: Each path offers room to grow from hands-on engineer to strategic leader.
Small  start-ups  might  only  have  one  or  two  DevOps  generalists,  while  mid-sized
companies carve out dedicated SRE and platform teams. Large enterprises can support
whole divisions focused on reliability or internal tooling.
Speaker  2:  Certifications  in  AWS,  GCP  or  Kubernetes  help  you  stand  out,  and
remote-friendly  cultures  mean  geography  is  less  of  a  barrier.  Expect  the  occasional
on-call rotation, but flexible hours are common. Junior roles start around $70k–$90k and
can rise above $150k as you progress to staff or principal levels. These careers reward
people  who  enjoy  problem  solving,  clear  communication  and  continuous  learning.
Diversity  and  inclusion  efforts  are  improving,  yet  more  participation  from
underrepresented  groups  is  still  needed.  Whether  you  specialise  or  blend  all  three
paths, the demand for automation and reliability skills keeps climbing.


DORA Metrics

Speaker 1: Software teams often argue about speed versus stability. The DORA (DevOps Research and Assessment) research cut through that noise by identifying four metrics that predict delivery performance.
Speaker  2:  Think  of  them  as  a  health  check  for  your  delivery  pipeline.  When  these
numbers improve, you know your process is maturing without sacrificing reliability.


Speaker 1: Deployment frequency tracks how often you successfully release code. Lead
time  measures  the  journey  from  commit  to  production.  Together  they  show  how
smoothly work flows.
Speaker  2:  Change  failure  rate  looks  at  what  proportion  of  releases  cause  problems.
Mean time to recovery tells you how quickly you can fix things when they break. High
performers excel on all four.
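Written as simple formulas (exact definitions vary slightly between the DORA reports), the four metrics look roughly like this:

```latex
\text{Deployment frequency} = \frac{\text{successful deployments}}{\text{time period}}
\qquad
\text{Lead time for changes} = \operatorname{median}\!\left(t_{\text{in production}} - t_{\text{committed}}\right)

\text{Change failure rate} = \frac{\text{deployments causing a production failure}}{\text{total deployments}}
\qquad
\text{Mean time to recovery} = \frac{\sum \text{time to restore service}}{\text{number of incidents}}
```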


Speaker 1: Numbers alone don't guarantee improvement. Track these metrics over time
and relate them to customer experience and business goals.
Speaker 2: If deployment frequency goes up but failure rate follows, you may need to
slow down and reinforce testing. Balanced metrics drive sustainable velocity.


Speaker 1: The takeaway is simple: measure what matters and act on it. DORA metrics
provide a clear lens on delivery performance.
Speaker  2:  Use  them  to  spark  meaningful  conversations  about  reliability  and  speed.
Continual tracking turns raw data into real improvement.


GitHub Actions Workflows

Speaker  1:  Remember  the  bad  old  days  of  copying  files  to  a  server  by  hand?  One
forgotten  step  and  everything  broke.  GitHub  Actions  saves  us  from  that  drama  by
automatically running the same steps every time code changes.

Speaker  2:  Think  of  it  like  a  recipe  that  bakes  itself  whenever  someone  adds  a  new
ingredient. Each change triggers a series of pre-defined tasks so that nothing is missed
and everyone can see exactly what happened. Automation isn't just a nice-to-have; it's
the safety net that keeps our future updates smooth and repeatable.
Speaker  1:  Once  you  taste  that  reliability,  you  never  want  to  go  back  to  manual
deployments.


Speaker  1:  A  GitHub  Actions  workflow  is  defined  in  a  YAML  file.  YAML  is  just  a
human-readable list of instructions, like a shopping list written in plain text.

Speaker 2: Each workflow starts with a trigger—maybe someone pushed code, opened
a  pull  request,  or  a  timer  fired.  That  trigger  launches  one  or  more  jobs  which  run  on
"runners," the virtual machines GitHub provides or your own servers.

Speaker  1:  Inside  a  job  you  define  individual  steps.  Those  steps  can  call  reusable
actions from the community or simply run shell commands. Because the workflow lives
next  to  the  code,  it  goes  through  the  same  pull  request  review  process,  making
automation changes visible and safe.
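A minimal sketch of that anatomy; the pull request trigger, runner label, and make test command are illustrative choices.

```yaml
name: example                     # workflow name shown in the Actions tab
on:
  pull_request:                   # trigger: runs whenever a pull request is opened or updated

jobs:
  checks:
    runs-on: ubuntu-latest        # runner: a GitHub-hosted virtual machine
    steps:
      - uses: actions/checkout@v4 # reusable community action
      - run: make test            # plain shell command (assumed Makefile target)
```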


Speaker 1: Picture updating a website by hand—copying files, running tests, and hoping
nothing breaks. With GitHub Actions, those chores become a push-button affair.

Speaker  2:  A  workflow  might  run  unit  tests  on  Windows,  Linux,  and  Mac  all  at  once
using a matrix job. If everything passes, another job bundles the code and deploys it to
your web host automatically.
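A sketch of such a matrix job, assuming npm test stands in for the project's real test command:

```yaml
jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}     # the same steps run once per operating system
    steps:
      - uses: actions/checkout@v4
      - run: npm test             # assumed test command
```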

Speaker 1: You can even have Actions send a Slack message when a build fails, like a
smoke detector for your code. These real-world examples save hours of manual effort
and catch mistakes before customers see them.
Speaker 2: And because the steps are version controlled, you can rewind or tweak them
just like regular code.


Speaker  1:  When  code  is  ready,  another  workflow  job  can  deploy  it.  You  can  require
approvals before the deployment step runs, a bit like needing a manager sign-off before
shipping a product.

Speaker 2: GitHub calls these gated stages "environments." You can limit who can run
them  and  track  every  deployment's  history.  For  cloud  infrastructure,  official  actions
from providers like AWS or Azure handle most of the heavy lifting.

Speaker  1:  Notifications  are  part  of  the  story  too.  A  simple  step  can  send  a  chat
message or email so the team knows exactly when something went live—or if it failed.
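A sketch of a gated deployment with a failure notification. The environment name, deploy script, and SLACK_WEBHOOK_URL secret are assumptions; required reviewers for the environment are configured in the repository settings rather than in the YAML itself.

```yaml
name: release
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production       # gated stage; approvals are set in repo settings
    steps:
      - uses: actions/checkout@v4
      - run: ./deploy.sh          # placeholder deployment command
      # notify chat only if something above failed; the webhook secret is hypothetical
      - if: failure()
        run: >
          curl -X POST -H 'Content-Type: application/json'
          -d '{"text":"Production deploy failed"}'
          "${{ secrets.SLACK_WEBHOOK_URL }}"
```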


Speaker 2: You might be wondering how all this applies to your day job. For help desk
staff, automated tests mean fewer broken releases landing in your queue.

Speaker  1:  If  you’re  in  quality  assurance,  GitHub  Actions  can  spin  up  fresh  test
environments  on  demand  so  you  spend  your  time  testing  features,  not  setting  up
servers.

Speaker  2:  Project  managers  get  clear  visibility  because  each  workflow  run  records
when  and  how  a  release  happened.  And  for  small  businesses,  automation  slashes  the
manual labour that used to require extra headcount.

Speaker 1: Wherever you land in IT, having a trusty robot assistant frees you to focus
on the interesting problems.


Speaker  1:  Keep  each  workflow  job  focused  on  a  single  task  so  failures  are  easy  to
trace. Short jobs also finish faster, giving developers quick feedback.

Speaker  2:  Limit  permissions  wherever  possible  and  rotate  secrets  regularly.  GitHub
lets you grant only the access a job truly needs, which reduces risk if credentials leak.
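For instance, a workflow that only needs to read the repository can say exactly that. A least-privilege sketch, with the job name and audit command assumed:

```yaml
name: dependency-audit        # illustrative workflow name
on: [pull_request]
permissions:
  contents: read              # read the repo only; no write or package scopes
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm audit        # assumed example task
```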

Speaker 1: Track how long jobs take and watch for any slowdowns. If a five-minute build
suddenly takes twenty, that’s a sign something needs attention.

Speaker  2:  Finally,  reuse  actions  from  the  community  or  your  own  library  to  avoid
reinventing the wheel and keep your automation maintainable.


Speaker 2: When teams treat workflows like code, automation evolves right alongside
the project. Each improvement is recorded in version control so you always know why
something changed.

Speaker  1:  GitHub  Actions  brings  that  automation  directly  into  the  repository.  Build
steps, tests, deployments and notifications all happen in a single, auditable place. You
get faster releases, fewer mistakes and clear visibility for everyone involved.

Speaker 2: Whether you’re supporting a small website or a complex enterprise system,
the key takeaway is simple: start small, keep iterating and let the platform handle the
repetition so you can focus on delivering value.


SRE Error Budgets

Speaker 1: Imagine you promised your app would be available 99.9% of the time. That
still allows roughly nine hours of downtime a year.
Speaker 2: Site Reliability Engineering, or SRE, turns those promises into math we can
track. The key tools are service level objectives, or SLOs, and the error budgets tied to
them.
Speaker 1: By defining how reliable a service must be, we can decide when it's safe to
launch new features and when it's time to pause and fix instability.


Speaker 1: A service level objective is basically a reliability target, like "respond to user
requests within two seconds 99% of the time." It sets clear expectations.
Speaker 2: For non-technical folks, think of an SLO as a promise to customers. If we hit
that goal, users stay happy. If we miss it, they notice glitches or slow pages.
Speaker  1:  We  monitor  these  objectives  continuously  so  we  know  if  the  service  is
drifting away from the acceptable range before customers complain.
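Under the hood, that two-second objective is usually tracked as a ratio over a measurement window, roughly:

```latex
\text{SLI} = \frac{\text{requests answered within 2 seconds}}{\text{total requests}},
\qquad
\text{SLO met} \iff \text{SLI} \ge 0.99
```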


Speaker 1: An error budget is the small slice of allowable failure built into an SLO. If our
target is 99.9% uptime, that gives us roughly forty-three minutes of downtime each month.
Speaker 2: As outages or performance issues occur, we "spend" that budget. When it's
gone, engineering focuses on reliability work instead of shipping new features.
Speaker  1:  This  approach  keeps  everyone  aligned.  Product  managers  see  how
instability eats into development time, while engineers know exactly when to slow their
release pace.
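The arithmetic behind those figures, assuming a 365-day year and a 30-day month:

```latex
\text{error budget} = (1 - \text{SLO}) \times \text{period}

0.001 \times 365 \times 24\ \text{h} \approx 8.8\ \text{hours per year}
\qquad
0.001 \times 30 \times 24 \times 60\ \text{min} \approx 43.2\ \text{minutes per month}
```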


Speaker 1: Error budgets aren't just for ops teams. They spark conversations about risk
across the organization.
Speaker 2: When the budget burns down faster than expected, it's a sign our releases
might be too risky or our SLO is unrealistic.
Speaker 1: Teams can agree to slow deployments, add more testing, or even adjust the
SLO  if  customer  impact  warrants  it.  The  data  helps  remove  emotion  from  those
decisions.


Speaker  1:  The  real  win  is  shared  language.  SLOs  and  error  budgets  help  teams
quantify acceptable risk instead of arguing about perfection.
Speaker  2:  They  also  create  breathing  room.  When  you're  under  budget,  you  can
innovate quickly. When it's nearly spent, stability becomes the priority.
Speaker 1: By treating reliability like any other feature, SRE practices keep users happy
and developers focused on the right work at the right time.


Trunk vs. Feature Branching

Speaker 1: Modern teams debate whether to work directly on the main branch or use
long-lived feature branches.
Speaker 2: Each method shapes how quickly changes integrate and how much merge
pain you face later.


Speaker  1:  Trunk-based  development  keeps  everyone  committing  to  the  same  main
line.
Speaker 2: Small, incremental changes integrate quickly, and feature flags hide work in
progress.


Speaker  1:  Feature  branching  isolates  each  piece  of  work  so  teams  can  experiment
safely.
Speaker 2: The downside is merges grow complex the longer a branch lives away from
trunk.


Speaker  1:  Teams  often  blend  approaches,  keeping  branches  short  and  integrating
daily.
Speaker  2:  The  goal  is  fast  feedback  with  just  enough  isolation  for  code  review  and
testing.


Speaker  1:  Whether  you  prefer  trunk-based  or  feature  branches,  keep  merges  small
and frequent.
Speaker 2: Continuous integration works best when nothing stays out of the main line
for long.
