Learning Objectives

By the end of this module, you will be able to:

  1. Create a repository with git init and understand how it differs from git clone
  2. Explain the three areas — working directory, staging area, and repository — and how data flows between them
  3. Stage changes selectively with git add, commit with git commit, and inspect state with git status and git log
  4. Write well-structured commit messages following the 50/72 convention
  5. Use git diff to compare changes across the three areas
  6. Configure .gitignore to exclude files from tracking

1. Creating a Repository

There are two ways to start working with Git.

git init — Start from Scratch

mkdir my-project
cd my-project
git init

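This creates a .git directory inside my-project. The working directory is empty and there are no commits yet. You're in the "weird initial state" from Module 4: an initial branch exists in name only and has no commits. (This module assumes the branch is called main; the actual name depends on your Git version and the init.defaultBranch setting, and may be master.)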

ls -la .git/
# HEAD, config, objects/, refs/, hooks/, ...

At this point there is no remote. This is a purely local repository.
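To see this state for yourself, the sketch below creates a throwaway repository and checks that it has no remotes and no commits. The temporary directory from mktemp is an assumption made for illustration, so the commands do not touch a real project.

```shell
# Create a scratch repository in a temporary directory (illustrative location).
repo="$(mktemp -d)"
cd "$repo"
git init -q

# A fresh repository has no remotes configured, so this prints nothing.
remotes="$(git remote)"

# HEAD cannot be resolved to a commit until the first commit exists.
if git rev-parse --verify -q HEAD >/dev/null; then
  has_commits="yes"
else
  has_commits="no"
fi

echo "remotes: '${remotes}'  commits: ${has_commits}"
```

On a fresh repository this should report an empty remote list and no commits.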

git clone — Copy an Existing Repository

git clone git@github.com:org/repo.git

This does three things:

  1. Downloads the repository — the entire object database and all branches
  2. Creates a remote called origin — pointing to the URL you cloned from
  3. Sets up tracking — your local main branch tracks origin/main

cd repo
git remote -v
# origin  git@github.com:org/repo.git (fetch)
# origin  git@github.com:org/repo.git (push)

You can rename the directory during clone:

git clone git@github.com:org/repo.git my-local-name

When to Use Which

Scenario                                   Command
-----------------------------------------  ------------------------------------------
Starting a brand-new project               git init
Joining an existing project                git clone <url>
Starting locally, pushing to GitHub later  git init, then git remote add origin <url>
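The "start locally, push later" scenario can be sketched as follows. The URL is the placeholder used earlier in this module, not a real repository, and adding a remote does not contact the network; that only happens when you fetch or push.

```shell
# Scratch repository in a temporary directory (illustrative location).
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Register the remote; this only writes an entry to .git/config.
git remote add origin git@github.com:org/repo.git

# Read the stored URL back from the repository configuration.
origin_url="$(git config --get remote.origin.url)"
echo "origin -> ${origin_url}"
```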

2. The Three Areas

This is the core mental model for Git's daily workflow. Everything you do with git add, git commit, git status, and git diff involves moving data between three areas.

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   Working Directory          Staging Area          Repository   │
│   (working tree)             (index)               (.git)       │
│                                                                 │
│   Your actual files          A snapshot draft       Permanent   │
│   on disk. Edit freely.      of what will go        commit      │
│                              into the next          history.    │
│                              commit.                            │
│                                                                 │
│        ──── git add ────►          ──── git commit ────►        │
│                                                                 │
│        ◄── git restore ──          ◄── git restore              │
│            (discard)                   --staged ──               │
│                                       (unstage)                 │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Working Directory

The files you see and edit. This is the directory on your filesystem. When you open app.py in your editor and change a line, the working directory changes.

Staging Area (Index)

A holding area between your working directory and the repository. When you run git add, you're copying the current state of a file into the staging area. The staging area is a draft of your next commit.

Why does it exist? Because you might have changed 10 files but only want to commit 3 of them right now. The staging area lets you pick and choose.

Repository

The .git directory — the permanent history. When you run git commit, Git takes everything in the staging area, creates a tree object, creates a commit object pointing to that tree, and updates the branch pointer. The data is now permanently recorded.

The Flow

1. Edit files           → changes appear in working directory
2. git add <files>      → changes move to staging area
3. git commit -m "msg"  → staging area becomes a new commit

Each step is deliberate. You can review at every stage.
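As a rough illustration of the three steps, the sketch below records what git status --porcelain reports after each one. The scratch directory, the file content, and the name and email are placeholders chosen for this example.

```shell
# Scratch repository with a placeholder identity so the commit can be created.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "print('hello')" > app.py          # 1. edit: the change exists only on disk
after_edit="$(git status --porcelain)"

git add app.py                          # 2. stage: the change is copied to the index
after_add="$(git status --porcelain)"

git commit -q -m "Add app.py"           # 3. commit: the index becomes a commit
after_commit="$(git status --porcelain)"

printf 'after edit:   %s\n' "$after_edit"
printf 'after add:    %s\n' "$after_add"
printf 'after commit: %s\n' "${after_commit:-(clean)}"
```

The file should move from untracked (??) to staged (A) to a clean status.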


3. git status — Understanding Your Current State

git status is the command you'll run most often. It tells you exactly where things stand across the three areas.

git status
On branch main
Your branch is up to date with 'origin/main'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   app.py              ← staged (green in terminal)

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   README.md           ← modified but not staged (red)

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        notes.txt                       ← brand-new file, not tracked (red)

Three sections:

Section                        Meaning                                          Color
-----------------------------  -----------------------------------------------  -----
Changes to be committed        In the staging area, ready for commit            Green
Changes not staged for commit  Modified in working directory but not yet added  Red
Untracked files                New files Git has never seen                     Red

Short Status

git status -s
M  app.py       ← staged modification (left column = staging area)
 M README.md    ← unstaged modification (right column = working directory)
?? notes.txt    ← untracked

The two-column format: left column = staging area status, right column = working directory status.

A  file.txt    ← staged new file
MM file.txt    ← staged AND has further unstaged modifications
D  file.txt    ← staged deletion

4. git add — Staging Changes

git add copies the current state of files from the working directory into the staging area.

Adding Specific Files

git add app.py
git add README.md src/utils.py

Adding Everything

git add .          # add all changes in current directory and below
git add -A         # add all changes in the entire repository

Caution: git add . and git add -A stage every change they find, including files you might not want tracked (logs, credentials, build artifacts). Configure .gitignore first (Section 9).

What "Adding" Actually Does

git add doesn't just mark a file as "to be committed." It takes a snapshot of the file's content at that moment and writes it into the staging area.

This has an important consequence:

echo "version 1" > file.txt
git add file.txt                # snapshot of "version 1" is staged
 
echo "version 2" > file.txt    # working directory now has "version 2"
 
git status
# Changes to be committed:
#     new file:   file.txt           ← "version 1" is staged
# Changes not staged for commit:
#     modified:   file.txt           ← "version 2" is in working dir

If you commit now, you commit "version 1" — the version that was staged. To commit "version 2", you'd need to git add file.txt again.
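One way to confirm this is to read the file back from each area. In the sketch below, git show :file.txt prints the copy held in the staging area and git show HEAD:file.txt prints the committed copy. The scratch directory and identity are placeholders for illustration.

```shell
# Scratch repository with a placeholder identity.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "version 1" > file.txt
git add file.txt                          # the index now holds "version 1"
echo "version 2" > file.txt               # only the working directory changes

staged_content="$(git show :file.txt)"    # content stored in the staging area
working_content="$(cat file.txt)"         # content on disk

git commit -q -m "Add file.txt"
committed_content="$(git show HEAD:file.txt)"

echo "staged:    ${staged_content}"
echo "working:   ${working_content}"
echo "committed: ${committed_content}"
```

The committed content should match what was staged, not what is on disk.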

Removing Files from Staging

git restore --staged app.py     # unstage (keeps working directory changes)

Older syntax (still works):

git reset HEAD app.py           # same effect

Discarding Working Directory Changes

git restore app.py              # revert file to the staged/committed version

Warning: git restore without --staged discards your working directory changes permanently. There is no undo.
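The difference between the two forms can be sketched like this. It assumes Git 2.23 or newer, where git restore is available; the scratch directory, file content, and identity are placeholders.

```shell
# Scratch repository with one committed file.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "original" > app.py
git add app.py
git commit -q -m "Add app.py"

# Edit and stage a change.
echo "edited" > app.py
git add app.py

# Unstage: the index goes back to the committed version, the edit stays on disk.
git restore --staged app.py
status_after_unstage="$(git status --porcelain)"
content_after_unstage="$(cat app.py)"

# Discard: the file on disk is overwritten with the version from the index.
git restore app.py
status_after_discard="$(git status --porcelain)"
content_after_discard="$(cat app.py)"

echo "after unstage: '${status_after_unstage}' / ${content_after_unstage}"
echo "after discard: '${status_after_discard}' / ${content_after_discard}"
```

After the second command the edit is gone from disk, which is why the warning above applies.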


5. git commit — Recording a Snapshot

git commit takes everything in the staging area and creates a permanent commit object.

With an Inline Message

git commit -m "Add user authentication"

With Your Editor

git commit

Git opens your configured editor (from core.editor). You write a message, save, and close. If you close without writing a message (or leave it empty), the commit is aborted.

Combining Add and Commit

git commit -a -m "Fix login bug"
# or
git commit -am "Fix login bug"

The -a flag automatically stages all modified tracked files before committing. It does NOT add untracked (new) files — those still need an explicit git add.
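A small sketch of that behavior: one tracked file is modified, one new file is created, and only the tracked file ends up in the commit. File names, the scratch directory, and the identity are made up for the example.

```shell
# Scratch repository with one tracked file.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked file"

echo "v2" > tracked.txt        # modification to a tracked file
echo "draft" > new-file.txt    # brand-new, untracked file

# -a stages the tracked modification; the untracked file is left alone.
git commit -q -am "Update tracked file"

leftover="$(git status --porcelain)"
committed="$(git show HEAD:tracked.txt)"
echo "still pending: ${leftover}"
echo "committed tracked.txt: ${committed}"
```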

What Happens Behind the Scenes

When you run git commit:

  1. Git reads the staging area (index)
  2. Creates a tree object representing the project snapshot
  3. Creates a commit object with the tree, parent, author, and message
  4. Moves the current branch pointer to the new commit

Before:                          After:

○ ◄── ○ ◄── ○                   ○ ◄── ○ ◄── ○ ◄── ○
C1     C2     C3                 C1     C2     C3     C4
               ▲                                       ▲
              main                                    main
               ▲                                       ▲
              HEAD                                    HEAD
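If you want to look at these objects directly, git cat-file can print them. The sketch below makes two commits and then inspects the second one. It is only an illustration; the scratch directory, file name, and identity are placeholders.

```shell
# Scratch repository with two commits.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"
first="$(git rev-parse HEAD)"
echo "two" >> notes.txt
git add notes.txt
git commit -q -m "Second commit"

# The commit object points at a tree (the snapshot) and at its parent commit.
commit_type="$(git cat-file -t HEAD)"
tree_type="$(git cat-file -t 'HEAD^{tree}')"
recorded_parent="$(git cat-file -p HEAD | sed -n 's/^parent //p')"

# The current branch now points at the new commit.
branch="$(git symbolic-ref --short HEAD)"
branch_tip="$(git rev-parse "refs/heads/${branch}")"

# Print the raw commit: tree, parent, author, committer, message.
git cat-file -p HEAD
```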

Amending the Last Commit

Made a typo in your commit message? Forgot to stage a file?

git add forgotten-file.py
git commit --amend -m "Corrected commit message"

This replaces the last commit with a new one (different hash). Only do this for commits that haven't been pushed.
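The hash change is easy to observe. The sketch below amends a commit and compares the hash before and after; the scratch directory, messages, and identity are placeholders.

```shell
# Scratch repository with a single, badly described commit.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "# Calculator" > README.md
git add README.md
git commit -q -m "stuff"
old_hash="$(git rev-parse HEAD)"

# Replace the last commit with one that has a better message.
git commit -q --amend -m "Add README"
new_hash="$(git rev-parse HEAD)"

commit_count="$(git rev-list --count HEAD)"
subject="$(git log -1 --format=%s)"
echo "old: ${old_hash}"
echo "new: ${new_hash}"
echo "commits on branch: ${commit_count}, subject: ${subject}"
```

The branch still has one commit, but it is a different object from the one that was amended.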


6. Writing Good Commit Messages

Commit messages are documentation for your future self and your teammates. They answer the question: why was this change made?

The 50/72 Rule

Summary line — max 50 characters (imperative mood)
                                                  |← 50 char mark
<blank line>
Body — wrap at 72 characters per line. Explain what
changed and why, not how (the diff shows how). Can
be multiple paragraphs.
                                                                        |← 72 char mark

The Format

Add user authentication via JWT tokens

The application previously had no authentication. Users
could access all endpoints without credentials.

This commit adds:
- JWT token generation on login
- Middleware to verify tokens on protected routes
- Token expiry and refresh logic

Resolves: JIRA-1337

Rules

  1. Summary line: imperative mood, max 50 characters

    • Good: Add password validation
    • Good: Fix null pointer in user lookup
    • Bad: Added password validation (past tense)
    • Bad: Adding password validation (gerund)
    • Bad: This commit adds password validation to the login form when the user submits their credentials (too long)

  2. Blank line between summary and body — Tools display the summary as a title. Without the blank line, the body gets concatenated.

  3. Body: wrap at 72 characters — Many tools display commit messages in fixed-width contexts (terminal, email, git log). 72 characters ensures readability.

  4. Explain what and why, not how — The diff already shows how the code changed, line by line. The message should explain the reasoning, context, or problem being solved.

  5. Reference tickets/issues — If your team uses Jira, GitHub Issues, or similar, include the ticket number.
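The length rules above are mechanical enough to check with a script. The function below is a hypothetical helper written for this module, not a Git command: it reads a message file and reports the first rule that is broken. The function name and its messages are assumptions.

```shell
# check_message is a hypothetical helper, not part of Git.
# It checks a commit message file against the 50/72 convention.
check_message() {
  msg_file="$1"
  summary="$(sed -n '1p' "$msg_file")"
  second_line="$(sed -n '2p' "$msg_file")"

  if [ "${#summary}" -gt 50 ]; then
    echo "summary line exceeds 50 characters"
    return 1
  fi
  if [ -n "$second_line" ]; then
    echo "second line must be blank"
    return 1
  fi
  # awk exits 0 only if some body line (line 3 onward) is longer than 72.
  if awk 'NR > 2 && length($0) > 72 { found = 1 } END { exit !found }' "$msg_file"; then
    echo "body line exceeds 72 characters"
    return 1
  fi
  echo "ok"
}

# Usage: write a sample message to a temporary file and check it.
sample="$(mktemp)"
printf 'Add password validation\n\nReject passwords shorter than 12 characters.\n' > "$sample"
check_message "$sample"
```

A check like this could be wired into a commit-msg hook, but that is outside the scope of this module.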

What Your Editor Shows

When Git opens your editor for a commit message, you'll see commented lines with context:

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# On branch feature-auth
# Changes to be committed:
#       modified:   src/auth.py
#       new file:   src/middleware.py
#

Lines that start with # are ignored. They are there to remind you what you're committing.


7. git log — Viewing History

git log shows the commit history of the current branch.

Basic Log

git log

Shows full commit details: hash, author, date, full message. Press q to exit, j/k or arrow keys to scroll.

One-Line Format

git log --oneline
# a1b2c3d Add user authentication
# e4f5a6b Fix login page layout
# 7c8d9e0 Initial commit

With Graph

git log --oneline --graph --all

Shows branches and merge structure as ASCII art. Add --decorate to show branch/tag labels (default in modern Git).

Limiting Output

git log -5                    # last 5 commits
git log --since="2024-01-01"  # commits since a date
git log --author="Jane"       # commits by author (substring match)
git log -- path/to/file       # commits that touched a specific file
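To try these filters on a known history, the sketch below builds three commits and queries them. The file names, the scratch directory, and the "Example User" identity are placeholders chosen for the example.

```shell
# Scratch repository with three commits touching two files.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "a" > a.txt
git add a.txt
git commit -q -m "Add a.txt"
echo "b" > b.txt
git add b.txt
git commit -q -m "Add b.txt"
echo "more" >> a.txt
git add a.txt
git commit -q -m "Extend a.txt"

# Subject of the most recent commit only.
last_subject="$(git log -1 --format=%s)"

# Number of commits that touched a.txt (b.txt's commit is filtered out).
a_count="$(git rev-list --count HEAD -- a.txt)"

# Commits whose author matches the substring "Example".
author_count="$(git log --author="Example" --oneline | wc -l | tr -d ' ')"

echo "latest: ${last_subject}; commits touching a.txt: ${a_count}; by Example: ${author_count}"
```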

We'll explore git log in depth in Module 15.

git show — Inspect a Single Commit

git show              # show the latest commit (diff + metadata)
git show a1b2c3d      # show a specific commit
git show HEAD~2       # show 2 commits ago

8. git diff — Comparing Changes

git diff shows the exact line-by-line differences between versions of files. It operates across the three areas.

Working Directory vs. Staging Area

git diff

Shows what you've changed but haven't staged yet. If you've staged everything, this shows nothing.

Staging Area vs. Last Commit

git diff --staged
# or equivalently:
git diff --cached

Shows what's been staged and will go into the next commit.

Working Directory vs. Last Commit

git diff HEAD

Shows all changes (staged and unstaged) compared to the last commit.

The Three Comparisons Visualized

    Last Commit        Staging Area        Working Directory
    (repository)       (index)             (your files)
         │                  │                     │
         │   git diff       │   git diff          │
         │   --staged       │   (no flag)         │
         │◄─────────────────│◄────────────────────│
         │                                        │
         │         git diff HEAD                  │
         │◄───────────────────────────────────────│
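The three comparisons can be reproduced with one file that has a staged change and a further unstaged change. In this sketch the added lines are pulled out of each diff with grep; the file name, line contents, scratch directory, and identity are placeholders.

```shell
# Scratch repository with one committed file.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "base" > app.txt
git add app.txt
git commit -q -m "Add app.txt"

echo "staged line" >> app.txt
git add app.txt                      # this line is now in the index
echo "unstaged line" >> app.txt      # this line is only on disk

# Keep only added lines; '^+[^+]' skips the "+++ b/app.txt" header.
unstaged="$(git diff | grep '^+[^+]')"
staged="$(git diff --staged | grep '^+[^+]')"
total_added="$(git diff HEAD | grep -c '^+[^+]')"

echo "git diff:          ${unstaged}"
echo "git diff --staged: ${staged}"
echo "git diff HEAD:     ${total_added} added lines"
```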

Reading Diff Output

diff --git a/README.md b/README.md
index 8ab686e..f1e2d3c 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,4 @@
 # My Project
 
-This is the old description.
+This is the new description.
+Added another line.

Element                Meaning
---------------------  ----------------------------------------------------------
--- a/README.md        The "before" version
+++ b/README.md        The "after" version
@@ -1,3 +1,4 @@        The hunk starts at line 1 in both versions; it covers 3 lines of the old file and 4 lines of the new one
Lines starting with -  Removed lines (red in terminal)
Lines starting with +  Added lines (green in terminal)
Lines with no prefix   Context (unchanged lines for reference; they begin with a single space)

Diffing Specific Files

git diff README.md                 # unstaged changes in one file
git diff --staged app.py           # staged changes in one file
git diff HEAD -- src/              # all changes in a directory vs last commit

We'll cover advanced diff usage (two-dot, three-dot, word-diff, external tools) in Module 14.


9. .gitignore — Excluding Files

Some files should never be tracked by Git: build artifacts, dependency directories, IDE settings, credentials, OS junk files. The .gitignore file tells Git to ignore them.

Creating a .gitignore

Create a file called .gitignore in the root of your repository:

# Dependencies
node_modules/
vendor/
venv/
__pycache__/
 
# Build output
dist/
build/
*.o
*.pyc
*.class
 
# IDE / Editor
.idea/
.vscode/
*.swp
*.swo
*~
 
# OS files
.DS_Store
Thumbs.db
 
# Environment / secrets
.env
.env.local
*.pem
credentials.json
 
# Logs
*.log
logs/

Pattern Syntax

Pattern         Matches
--------------  ------------------------------------------------------------------
*.log           Any file ending in .log in any directory
build/          Any directory named build, at any depth, and everything in it
/build/         Only the build directory at the repository root
doc/*.txt       doc/notes.txt but not doc/sub/notes.txt
doc/**/*.txt    doc/notes.txt and doc/sub/notes.txt
!important.log  Negates a rule: track important.log even though *.log is ignored
#               A line starting with # is a comment
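A quick way to test patterns is to create a few files and ask Git which untracked files it still sees. The sketch below uses git ls-files --others --exclude-standard, which lists untracked files that are not ignored. All file and directory names are invented for the example.

```shell
# Scratch repository with a handful of sample files.
repo="$(mktemp -d)"
cd "$repo"
git init -q
mkdir -p build src/build doc/sub
touch debug.log important.log build/out.bin src/build/out.bin \
      doc/notes.txt doc/sub/notes.txt

cat > .gitignore << 'EOF'
*.log
!important.log
/build/
doc/*.txt
EOF

# Untracked files that survive the ignore rules.
visible="$(git ls-files --others --exclude-standard)"
echo "$visible"
```

With these rules, debug.log, build/out.bin, and doc/notes.txt should be hidden, while important.log, src/build/out.bin, doc/sub/notes.txt, and .gitignore itself remain visible.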

.gitignore Applies Only to Untracked Files

If a file is already tracked (was committed before you added it to .gitignore), ignoring it won't remove it from the repository. You need to explicitly untrack it:

# Stop tracking a file but keep it on disk
git rm --cached credentials.json
echo "credentials.json" >> .gitignore
git add .gitignore
git commit -m "Stop tracking credentials file"
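The sketch below runs the whole sequence in a scratch repository and then checks that the file is still on disk but no longer tracked. The file content, scratch directory, and identity are placeholders. Note that the file still exists in earlier commits; untracking does not erase it from history.

```shell
# Scratch repository where a secrets file was committed by mistake.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo '{"token": "example"}' > credentials.json
git add credentials.json
git commit -q -m "Add credentials by mistake"

# Remove the file from the index only, then ignore it from now on.
git rm -q --cached credentials.json
echo "credentials.json" >> .gitignore
git add .gitignore
git commit -q -m "Stop tracking credentials file"

still_tracked="$(git ls-files credentials.json)"   # empty when untracked
if [ -f credentials.json ]; then on_disk="yes"; else on_disk="no"; fi
echo "tracked: '${still_tracked}'  on disk: ${on_disk}"
```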

Nested .gitignore Files

You can place .gitignore files in subdirectories. Rules in a subdirectory .gitignore apply to that directory and its children. Rules in deeper .gitignore files can override rules from parent directories.

Global .gitignore

For files that are specific to your machine (not the project) — like .DS_Store on macOS or Thumbs.db on Windows:

git config --global core.excludesFile ~/.gitignore_global

Then create ~/.gitignore_global:

.DS_Store
Thumbs.db
*.swp
*~

This applies to all repositories on your machine without polluting each project's .gitignore.

Checking Why a File Is Ignored

git check-ignore -v file.txt
# .gitignore:3:*.txt    file.txt
#             ↑ line 3 of .gitignore matched

Command Reference

Command                          Description
-------------------------------  --------------------------------------------------
git init                         Create a new repository in the current directory
git clone <url>                  Clone a remote repository
git clone <url> <dir>            Clone into a specific directory name
git status                       Show the state of the three areas
git status -s                    Short-format status
git add <file>                   Stage a specific file
git add .                        Stage all changes in current directory and below
git add -A                       Stage all changes in the entire repository
git restore --staged <file>      Unstage a file (keep working dir changes)
git restore <file>               Discard working directory changes (destructive!)
git commit -m "message"          Commit staged changes with inline message
git commit                       Commit staged changes (opens editor for message)
git commit -am "message"         Stage tracked files and commit in one step
git commit --amend               Replace the last commit (new message and/or files)
git log                          Show commit history
git log --oneline                Compact one-line history
git log --oneline --graph --all  ASCII graph of all branches
git log -<n>                     Show the last n commits (for example, git log -5)
git show                         Show the latest commit details and diff
git show <hash>                  Show a specific commit
git diff                         Diff: working directory vs. staging area
git diff --staged                Diff: staging area vs. last commit
git diff HEAD                    Diff: working directory vs. last commit
git rm --cached <file>           Untrack a file (keep it on disk)
git check-ignore -v <file>       Show which gitignore rule matches a file

Hands-On Lab: The Complete Basic Workflow

This lab walks through the entire daily Git workflow: init, add, commit, diff, log — building a small Python project.

Setup

mkdir ~/git-workflow-lab
cd ~/git-workflow-lab
git init

Checkpoint:

git status
# On branch main
# No commits yet
# nothing to commit (create/copy files and use "git add" to track)

Step 1: Create Initial Files

cat > app.py << 'EOF'
def add(a, b):
    return a + b
 
def subtract(a, b):
    return a - b
 
if __name__ == "__main__":
    print(f"2 + 3 = {add(2, 3)}")
    print(f"5 - 2 = {subtract(5, 2)}")
EOF
 
cat > README.md << 'EOF'
# Calculator
 
A simple calculator application.
EOF

Step 2: Observe the Untracked State

git status

Both files appear as Untracked files (red). Git knows they exist but isn't tracking them.

git status -s
# ?? README.md
# ?? app.py

The ?? means untracked.

Step 3: Stage and Commit

git add README.md app.py
git status

Both files are now under Changes to be committed (green).

git status -s
# A  README.md
# A  app.py

The A in the left column means "added to staging area."

git commit -m "Add calculator app with add and subtract"

Checkpoint:

git log --oneline
# a1b2c3d Add calculator app with add and subtract
 
git status
# On branch main
# nothing to commit, working tree clean

Clean state — working directory, staging area, and repository all match.

Step 4: Make Changes and Observe the Diff

Edit app.py to add a multiply function:

cat > app.py << 'EOF'
def add(a, b):
    return a + b
 
def subtract(a, b):
    return a - b
 
def multiply(a, b):
    return a * b
 
if __name__ == "__main__":
    print(f"2 + 3 = {add(2, 3)}")
    print(f"5 - 2 = {subtract(5, 2)}")
    print(f"4 * 3 = {multiply(4, 3)}")
EOF

Now check the diff:

git diff

You should see the multiply function and the new print statement as + lines (green).

git status -s
#  M app.py        ← M in right column = unstaged modification

Step 5: Stage and See the Diff Shift

git add app.py
git diff              # nothing — no unstaged changes
git diff --staged     # shows the multiply changes — they're staged now
git status -s
# M  app.py        ← M in left column = staged modification

Step 6: Make More Changes After Staging

Edit app.py again — add a divide function:

cat >> app.py << 'EOF'
 
def divide(a, b):
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b
EOF

Now check:

git status -s
# MM app.py        ← M in BOTH columns: staged AND unstaged changes
 
git diff              # shows the divide function (unstaged)
git diff --staged     # shows the multiply function (staged)
git diff HEAD         # shows BOTH multiply and divide (all changes vs last commit)

This is the three-way state in action. app.py exists in three different versions right now:

  • Repository: without multiply or divide
  • Staging area: with multiply, without divide
  • Working directory: with both multiply and divide

Step 7: Commit Only What's Staged

git commit -m "Add multiply function"

Checkpoint:

git status -s
#  M app.py        ← divide function is still unstaged
 
git log --oneline
# b2c3d4e Add multiply function
# a1b2c3d Add calculator app with add and subtract

The multiply function was committed. The divide function is still only in the working directory.

Step 8: Stage and Commit the Rest

git add app.py
git commit -m "Add divide function with zero-division guard"

Checkpoint:

git log --oneline
# c3d4e5f Add divide function with zero-division guard
# b2c3d4e Add multiply function
# a1b2c3d Add calculator app with add and subtract
 
git status
# nothing to commit, working tree clean

Step 9: Set Up .gitignore

# Create some files that shouldn't be tracked
mkdir __pycache__
echo "cached bytecode" > __pycache__/app.cpython-311.pyc
echo "DEBUG=true" > .env
echo "random notes" > notes.tmp
git status -s
# ?? .env
# ?? __pycache__/
# ?? notes.tmp

All three show up as untracked. Create a .gitignore:

cat > .gitignore << 'EOF'
__pycache__/
*.pyc
.env
*.tmp
EOF

Checkpoint:

git status -s
# ?? .gitignore       ← only .gitignore shows up now!

The other files are invisible to Git.

git check-ignore -v .env
# .gitignore:3:.env    .env
 
git check-ignore -v notes.tmp
# .gitignore:4:*.tmp    notes.tmp

Commit the .gitignore:

git add .gitignore
git commit -m "Add gitignore for Python artifacts and secrets"

Step 10: Intentional Mistake — Commit a Bad Message

echo "# Version 1.0" >> README.md
git add README.md
git commit -m "stuff"

That's a terrible commit message. Fix it:

git commit --amend -m "Update README with version number"

Checkpoint:

git log --oneline
# (new hash) Update README with version number    ← fixed message, different hash
# d4e5f6a Add gitignore for Python artifacts and secrets
# c3d4e5f Add divide function with zero-division guard
# b2c3d4e Add multiply function
# a1b2c3d Add calculator app with add and subtract

Notice the hash changed — --amend creates a new commit (immutability!).

Step 11: View the Full History

git log

Read through each commit — author, date, full message. Press q to exit.

git show HEAD~2    # show the divide commit (2 commits before HEAD)

Challenge

  1. Add a tests.py file with some test functions. Create two separate commits: one for add/subtract tests, one for multiply/divide tests. Use git add selectively (don't use -A).

  2. After committing everything, run git diff HEAD~2..HEAD to see all changes across the last two commits.

  3. Create a file called secrets.txt, accidentally commit it, then remove it from tracking (using git rm --cached) and add it to .gitignore. Verify with git status that the file is on disk but no longer tracked.

Cleanup

rm -rf ~/git-workflow-lab

Common Pitfalls & Troubleshooting

Pitfall                                            Explanation
-------------------------------------------------  ------------------------------------------------------------
git add . includes unwanted files                  Always set up .gitignore before your first git add .. If you accidentally committed secrets or large binaries, see Module 11 (git reset) or use git rm --cached.
"no changes added to commit" after editing         You edited files but didn't run git add. Changes must be staged before they can be committed. Or use git commit -a for tracked files.
Commit message editor is confusing                 If Vim opens and you're stuck: type i to enter insert mode, write your message, press Esc, type :wq to save and quit. Or change your editor: git config --global core.editor "code --wait".
git commit -a doesn't add new files                The -a flag only stages modified tracked files. Brand-new files are untracked and must be added explicitly with git add.
Forgetting to stage after further edits            If you git add file.py, then edit file.py again, the second edit is NOT staged. You need to git add again to include the latest changes.
.gitignore doesn't work for already-tracked files  .gitignore only affects untracked files. To stop tracking a file: git rm --cached <file>, then add it to .gitignore, then commit.
Summary line too long                              Keep it to 50 characters or fewer. GitHub and many other tools truncate long summaries with an ellipsis. Move details to the body (after a blank line).

Pro Tips

  1. Check status obsessively. Run git status before and after every git add and git commit. It's your dashboard. Develop the reflex.

  2. Stage intentionally. Don't default to git add . once you're past the beginner stage. Staging specific files (or even specific hunks with git add -p, covered in Module 12) lets you craft clean, logical commits from messy work sessions.

  3. One logical change per commit. A commit should represent one coherent change — "Add login form" or "Fix email validation bug." Don't bundle unrelated changes. Don't split one change across multiple commits (unless it's genuinely large and can be broken into independent pieces).

  4. git diff --staged before every commit. Review what you're about to commit. This catches accidental debug lines, stray whitespace, and forgotten files.

  5. Use templates for commit messages. If your team has a format (e.g., including Jira ticket numbers), configure a template:

    git config --global commit.template ~/.gitmessage

    Then create ~/.gitmessage:

    # [TICKET-XXX] Summary (max 50 chars)
    
    # Why is this change needed?
    # What does it do?
    
  6. Start your .gitignore from a template. GitHub maintains a collection at github.com/github/gitignore with templates for Python, Node.js, Java, Go, and more. Start with one of these and customize.

  7. Alias git status if you type it a lot:

    git config --global alias.st status
    git config --global alias.s "status -s"

    Now git s shows short status.


Quiz / Self-Assessment

1. What are the three areas in Git's workflow model?

Answer
Working directory (your files on disk), staging area / index (a draft of the next commit), and repository (the .git directory containing permanent commit history).

2. What does git add actually do?

Answer
It copies the current state of the specified files from the working directory into the staging area. It takes a snapshot of the file at that moment — further edits to the file are not automatically included.

3. What's the difference between git diff and git diff --staged?

Answer
git diff (no flags) shows differences between the working directory and the staging area — changes you've made but haven't staged. git diff --staged shows differences between the staging area and the last commit — changes that will be included in the next commit.

4. What does the -a flag do in git commit -am "msg"?

Answer
It automatically stages all modified tracked files before committing. It does NOT add untracked (new) files — those still require an explicit git add.

5. What is the 50/72 rule for commit messages?

Answer
The summary line (first line) should be at most 50 characters, in imperative mood. The body (after a blank line) should wrap at 72 characters per line. This ensures readability in terminals, log viewers, and email-based workflows.

6. You run git add file.py, then edit file.py again. What happens if you commit?

Answer
The commit will contain the version of file.py that existed when you ran git add, not the latest version. To include the new edits, you need to run git add file.py again before committing.

7. How does .gitignore work, and what's its limitation?

Answer
.gitignore tells Git to pretend certain files don't exist — they won't appear in git status and won't be staged by git add .. However, it only applies to untracked files. If a file is already tracked (previously committed), .gitignore has no effect on it. You must first untrack it with git rm --cached.

8. What's the difference between git init and git clone?

Answer
git init creates a new empty repository in the current directory — no commits, no remote. git clone <url> downloads an existing repository, sets up a remote called origin, and creates a local tracking branch.

9. What does git restore --staged <file> do?

Answer
It removes the file from the staging area (unstages it). The file's changes remain in the working directory — nothing is lost. It reverses a git add.

10. What does git commit --amend do, and when is it safe to use?

Answer
It replaces the last commit with a new one (incorporating any newly staged changes and/or a new message). The replacement commit has a different hash from the original. It's safe to use only on commits that haven't been pushed to a shared remote, because amending a pushed commit rewrites history others may depend on.