
you've typed a url thousands of times. hit enter, page loads, move on with your life. but between that keystroke and the rendered webpage, there's an entire symphony of protocols, lookups, and transformations happening in milliseconds.
this is the journey of a url from the moment you press enter to the moment pixels appear on your screen. we'll skip the rabbit holes and focus on what actually matters for understanding the flow.
before we follow the journey, let's understand what we're working with. url stands for "uniform resource locator," and that word "uniform" is doing heavy lifting.
uniform means every url follows the same formal specification (RFC 3986 if you're into that sort of thing):
scheme://authority/path?query#fragment
here's a real example:
https://api.ninefive.com:443/users/1?active=true#profile
let's break it down:
- https - the protocol (scheme) we're using to fetch the resource
- api.ninefive.com:443 - where we're connecting (authority = domain + optional port)
- /users/1 - the path to the specific resource on that server
- ?active=true - optional query parameters
- #profile - optional fragment (which section of the page to jump to)

the authority part (api.ninefive.com:443) can actually omit the port if you're using the default. https://api.ninefive.com:443 is identical to https://api.ninefive.com because 443 is https's default port.
urls are a specific type of uri (uniform resource identifier). the "locator" part means a url answers two critical questions:
- where does the resource live? (the authority and path)
- how do you retrieve it? (the scheme)
compare that to a urn (uniform resource name) like urn:isbn:0471694703. this uniquely identifies a book, but it doesn't tell you where to get it or how to retrieve it. that's the difference between identification (uri/urn) and location (url).
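to see that structure in code, here's a quick sketch using python's urllib.parse (just an illustration, not part of any browser's internals):

```python
from urllib.parse import urlsplit

# split a url into the components defined by RFC 3986
parts = urlsplit("https://api.ninefive.com:443/users/1?active=true#profile")

print(parts.scheme)    # https
print(parts.netloc)    # api.ninefive.com:443  (the authority)
print(parts.hostname)  # api.ninefive.com
print(parts.port)      # 443
print(parts.path)      # /users/1
print(parts.query)     # active=true
print(parts.fragment)  # profile
```

note that urlsplit gives you the authority both raw (netloc) and already separated into hostname and port.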
here's where things get interesting. when you type a url with a space - say, searching for "hello world" - your browser needs to send this to a server:
GET /search?q=hello world HTTP/1.1
Host: google.com
but http has a problem. it was designed in 1991 to be human-readable using ascii characters. that's great for debugging, but ascii only supports 128 characters. no arabic, no japanese, no emojis. and crucially, spaces are ambiguous.
look at that request again. where does the url end? is it /search?q=hello or /search?q=hello world? how does the server know where HTTP/1.1 begins?
the solution is percent encoding. your browser automatically converts problematic characters:
// what you type
"https://google.com/search?q=hello world"
// what actually gets sent
"https://google.com/search?q=hello%20world"
the space becomes %20. any character that's not ascii-safe or might be reserved gets encoded. this is why you see those weird % symbols in urls sometimes - it's not broken, it's working exactly as designed.
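you can reproduce exactly what the browser does with python's urllib.parse (any language's standard library has an equivalent):

```python
from urllib.parse import quote, unquote

# reserved and unsafe characters get percent-encoded
encoded = quote("hello world")
print(encoded)           # hello%20world

# the server decodes it back before processing
print(unquote(encoded))  # hello world

# non-ascii characters become multi-byte utf-8 sequences
print(quote("café"))     # caf%C3%A9
```

that last line is why a single accented character can turn into six characters in a url: it's the utf-8 bytes, each encoded as %XX.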
now you have a properly formatted url: https://api.ninefive.com/users/1. but there's a problem. your browser needs an ip address to establish a tcp connection. domain names like api.ninefive.com don't exist at the network layer - they're a human convenience.
this is where dns (domain name system) comes in. it's basically a distributed phone book that translates domain names to ip addresses.
but before asking the entire internet, your system checks a few local caches:
your browser keeps a cache of recent dns lookups. if you visited api.ninefive.com five minutes ago and the ttl (time to live) hasn't expired, it uses that cached ip address. instant lookup, zero network requests.
if the browser doesn't have it, it asks the os. your operating system maintains its own dns cache. still no network request needed.
before going to the network, the os checks /etc/hosts (on unix-like systems). this is a local file where you can manually map domain names to ip addresses:
# /etc/hosts
127.0.0.1 localhost
192.168.1.100 mydevserver.local
developers use this for local testing. point myapp.local to 127.0.0.1 and you can test your app locally with a real-looking domain name.
if none of the caches have it, your os finally makes a network request to a recursive dns resolver (usually provided by your isp or router).
this resolver doesn't know the answer either. so it walks through the global dns hierarchy:
- it asks a root nameserver, which points it to the nameservers for the .com tld
- it asks a .com tld nameserver, which points it to the authoritative nameserver for ninefive.com
- it asks that authoritative nameserver, which finally returns the ip address for api.ninefive.com
the resolver caches the result, returns it to your os, which caches it, which returns it to your browser, which caches it. next time you visit, all those steps get skipped until the ttl expires.
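the ttl logic behind all those cache layers is simple enough to sketch in a few lines. this is a toy model, not any real resolver's code:

```python
import time

class DnsCache:
    """toy ttl-based cache, like the ones in your browser and os."""

    def __init__(self):
        self._entries = {}  # domain -> (ip, expires_at)

    def put(self, domain, ip, ttl_seconds):
        # remember the answer, but only until the ttl runs out
        self._entries[domain] = (ip, time.monotonic() + ttl_seconds)

    def get(self, domain):
        entry = self._entries.get(domain)
        if entry is None:
            return None  # cache miss: ask the next layer
        ip, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._entries[domain]  # ttl expired: treat as a miss
            return None
        return ip

cache = DnsCache()
cache.put("api.ninefive.com", "104.21.82.120", ttl_seconds=300)
print(cache.get("api.ninefive.com"))  # 104.21.82.120 (hit, zero network requests)
print(cache.get("unknown.example"))   # None (miss: fall through to the resolver)
```

every layer in the chain (browser, os, recursive resolver) runs some version of this same check before doing any real work.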
now your browser has an ip address: 104.21.82.120. time to connect.
before any http traffic flows, your browser needs to establish a reliable connection using tcp:
- your browser sends a syn packet ("i'd like to connect")
- the server replies with syn-ack ("received, i'm ready")
- your browser answers with an ack ("confirmed, let's talk")
this three-way handshake ensures both sides are ready to communicate. it adds latency (roughly one round-trip time), but it guarantees reliability.
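from application code you never see the three packets: the kernel performs the handshake when connect() is called. a minimal local sketch, using a throwaway server on localhost:

```python
import socket

# a throwaway server listening on an ephemeral localhost port
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

# connect() blocks until the syn / syn-ack / ack exchange completes
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
conn, addr = server.accept()

print("connected from", addr)  # by this point the handshake is done

client.close()
conn.close()
server.close()
```

a browser does the same thing with the ip address it got from dns and port 443, then layers tls and http on top of the raw byte stream.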
since we're using https://, there's an additional step. before sending any actual data, the browser and server need to establish an encrypted connection using tls (transport layer security):
- the browser sends a "client hello" listing the tls versions and cipher suites it supports
- the server responds with its chosen parameters and its certificate, which proves it really is api.ninefive.com
- both sides derive shared session keys, and everything from here on is encrypted
this is why https is slower than http on the first request - there's an extra round trip for the tls handshake. but subsequent requests can reuse the same encrypted connection.
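python's ssl module shows the knobs involved. creating a context doesn't touch the network, so this sketch just inspects the defaults a browser-like client would use:

```python
import ssl

# a client-side context with sane defaults: certificate verification on,
# hostname checking on, obsolete protocol versions disabled
ctx = ssl.create_default_context()

print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True: the server must present a valid cert
print(ctx.check_hostname)                    # True: the cert must match the domain

# wrapping a tcp socket with this context is what triggers the tls handshake:
# tls_sock = ctx.wrap_socket(tcp_sock, server_hostname="api.ninefive.com")
```

the server_hostname argument matters: it's both how the certificate gets validated and how a server hosting many domains on one ip knows which certificate to present (sni).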
finally, we can send the actual http request. your browser constructs something like:
GET /users/1?active=true HTTP/1.1
Host: api.ninefive.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)
Accept: text/html,application/json
Accept-Language: en-US,en;q=0.9
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
this travels over the encrypted tcp connection to the server.
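under the hood an http/1.1 request is just text with crlf line endings. a sketch of assembling a similar (abbreviated) request by hand:

```python
# http/1.1 is plain text: a request line, headers, then a blank line
request = (
    "GET /users/1?active=true HTTP/1.1\r\n"
    "Host: api.ninefive.com\r\n"
    "Accept: application/json\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"  # the blank line marks the end of the headers
)

# over https, these bytes would be written to the tls-wrapped socket:
# tls_sock.sendall(request.encode("ascii"))
print(request)
```

that trailing blank line is load-bearing: it's how the server knows the headers are finished and (for a GET with no body) the request is complete.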
on the other side, the server:
- parses the http request
- routes /users/1 to the appropriate handler
- builds the response and sends it back:

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 157
Cache-Control: max-age=300
{
"id": 1,
"name": "marouane",
"active": true,
"email": "boufaroujmarouan@gmail.com"
}
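the content-type header tells the client how to interpret those bytes. json like the body above parses in one call:

```python
import json

# the response body, exactly as it arrived over the wire
body = """{
  "id": 1,
  "name": "marouane",
  "active": true,
  "email": "boufaroujmarouan@gmail.com"
}"""

user = json.loads(body)
print(user["name"])    # marouane
print(user["active"])  # True (json's true becomes python's True)
```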
the browser receives this response and:
- checks the status code (200 means success)
- reads the headers (content-type says the body is json; cache-control says it can be reused for 300 seconds)
- parses the body and renders the result
the #profile fragment from our original url? that only matters now, at the very end. the browser scrolls to the element with id="profile" after rendering.
all of this - dns lookup, tcp handshake, tls negotiation, http request, server processing, http response, parsing, rendering - happens in milliseconds.
next time you type a url and hit enter, you'll know there's an entire choreographed dance happening behind that loading spinner. and maybe you'll appreciate why "the internet is slow" when you're on a bad connection - every one of those round trips adds latency.
this is a simplified version of a complex process. i intentionally skipped details about http/2 multiplexing, dns over https, cdn routing, load balancing, and about a dozen other topics that would turn this into a book. the goal was understanding the main flow, not documenting every possible edge case.