{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "240f4b13",
   "metadata": {},
   "source": [
    "# Collection Types\n",
    "\n",
    "Python’s [data structure](https://docs.python.org/3/tutorial/datastructures.html) are the built‑in **`container`** types you use to store, organize, and operate on groups of items/values. The four main data structure types are: `list`, `tuple`, `dictionary`, and `set`. \n",
    "\n",
    "The commonly used **built-in**/**standard** data structures can be grouped as: \n",
    "- Sequence Types\n",
    "- Set Types\n",
    "- Mapping Types\n",
    "\n",
    "```\n",
    "Collections (broad category)\n",
    "├── Sequence Types (ordered, indexed)\n",
    "│   ├── list\n",
    "│   ├── tuple\n",
    "│   ├── str (string)\n",
    "│   ├── range\n",
    "│   └── bytes/bytearray\n",
    "│\n",
    "├── Set Types (unordered, no duplicates)\n",
    "│   ├── set\n",
    "│   └── frozenset\n",
    "│\n",
    "└── Mapping Types (key-value pairs)\n",
    "    └── dict\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2349522e",
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "from pathlib import Path\n",
    "\n",
    "# Find project root by looking for _config.yml\n",
    "current = Path.cwd()\n",
    "for parent in [current, *current.parents]:\n",
    "    if (parent / '_config.yml').exists():\n",
    "        project_root = parent\n",
    "        break\n",
    "else:\n",
    "    project_root = Path.cwd().parent.parent\n",
    "\n",
    "# Add project root to path\n",
    "sys.path.insert(0, str(project_root))\n",
    "\n",
    "# Import shared teaching helpers and cell magics\n",
    "from shared import thinkpython, diagram, jupyturtle, structshape\n",
    "from shared.download import download\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "da204028",
   "metadata": {},
   "source": [
    "Among them, `list`, `tuple`, and `range` are [sequence types](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range). Also, `strings` are considered a sequence type (a kind of collection) because they behave like collections of characters, although conceptually, in many languages other than Python, strings are considered primitives, not collections.\n",
    "\n",
    "`range` is debatable as a “collection”; it’s really a lazy sequence generator, not a traditional data structure. It does look like a sequence, so Python treats it as one. Note that `range` stores a formula, not data; it computes values on demand. Therefore, `range` is a sequence protocol implementer, not a data container.\n",
    "\n",
    "A `set` object is an \"unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference\" [cite:p`Python Standard Library_2026`].\n",
    "\n",
    "Mapping in Python refers to data types that store **key-value** pairs, where each key is associated with a corresponding value. The most common mapping type is the dictionary (`dict`).\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3227525f",
   "metadata": {},
   "source": [
    "## Mutability\n",
    "\n",
    "Mutability and order are two important characteristics of Python data structures, summarized in the table below. \n",
    "\n",
    "| Type | Literal | `Mutable` | Ordered | Usage |\n",
    "|------|---------|---------|---------|-------|\n",
    "| **Sequence Types** | | | | |\n",
    "| list | `[1, 2, 3]` | **Yes** | Yes | General purpose; dynamic arrays; index/slice access. |\n",
    "| tuple | `(1, 2, 3)` | No | Yes | Fixed records; function returns; hashable if elements are.* |\n",
    "| range | `range(10)` | No | Yes | Memory-efficient integer sequences; iteration. |\n",
    "| str | `\"hi\"` | No | Yes | Text; immutable character sequences. |\n",
    "| **Set Types** | | | | |\n",
    "| set | `{1, 2, 3}` | **Yes** | No | Unique items; membership testing; set operations. |\n",
    "| frozenset | `frozenset({1,2})` | No | No | Immutable set; dict keys; set elements. |\n",
    "| **Mapping Types** | | | | |\n",
    "| dict | `{\"a\": 1, \"b\": 2}` | **Yes** | Yes** | Key-value lookups; counting; grouping; configuration. |\n",
    "\n",
    "**Notes:**\n",
    "- \\* Tuples are hashable only if all elements are hashable.\n",
    "- \\*\\* Dict ordering guaranteed in Python 3.7+.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "62dfd965",
   "metadata": {},
   "source": [
    "## Lists (Ordered Collections)\n",
    "\n",
    "Lists store **multiple** items in **order** and are **mutable** (can be changed):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "eea52390",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Numbers: [1, 2, 3, 4, 5]\n",
      "Fruits: ['Apple', 'Banana', 'Cherry']\n",
      "Mixed: [1, 'hello', 3.14, True]\n",
      "Empty list: []\n"
     ]
    }
   ],
   "source": [
    "numbers = [1, 2, 3, 4, 5]\n",
    "fruits = [\"Apple\", \"Banana\", \"Cherry\"]\n",
    "mixed = [1, \"hello\", 3.14, True]\n",
    "empty_list = []\n",
    "\n",
    "print(f\"Numbers: {numbers}\")\n",
    "print(f\"Fruits: {fruits}\")\n",
    "print(f\"Mixed: {mixed}\")\n",
    "print(f\"Empty list: {empty_list}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "52ee45c0",
   "metadata": {},
   "source": [
    "Common list operations:\n",
    "1. Append: `numbers.append(6)` → `[1, 2, 3, 4, 5, 6]`\n",
    "2. **Access/Indexing**: `numbers[0]` → `1`\n",
    "3. **Slicing**: `numbers[1:3]` → `[2, 3]`\n",
    "4. Length: `len(numbers)` → `5`\n",
    "5. Update: `numbers[0] = 99` → `[99, 2, 3, 4, 5, 6]`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "89fd3adf",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Original list: ['apple', 'banana', 'cherry']\n",
      "After append: ['apple', 'banana', 'cherry', 'date']\n",
      "First fruit: apple\n",
      "Number of fruits: 4\n"
     ]
    }
   ],
   "source": [
    "fruits = [\"apple\", \"banana\", \"cherry\"]        ### create a list by assignment\n",
    "print(f\"\\nOriginal list: {fruits}\")\n",
    "\n",
    "fruits.append(\"date\")                         ### mutable: update/change the list\n",
    "print(f\"After append: {fruits}\")\n",
    "\n",
    "print(f\"First fruit: {fruits[0]}\")            ### indexing\n",
    "print(f\"Number of fruits: {len(fruits)}\")     ### length"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "26935c03",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "After modification: ['apple', 'banana', 'cranberry', 'date']\n"
     ]
    }
   ],
   "source": [
    "fruits[2] = \"cranberry\"                     ### lists are mutable: modify an element\n",
    "print(f\"After modification: {fruits}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d9acd512",
   "metadata": {},
   "source": [
    "## Tuples (Immutable Sequences)\n",
    "\n",
    "Tuples are similar to lists, but they cannot be changed after creation (**immutable**). So, use tuples when you want to ensure data won't change:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "56188e72",
   "metadata": {},
   "source": [
    "rgb_color = (255, 128, 0)\n",
    "single = (42,)             # Note the comma for single-item tuple"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "e839f7cb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Converted to tuple: ('apple', 'banana', 'cranberry', 'date')\n",
      "first element in coordinate:  apple\n"
     ]
    }
   ],
   "source": [
    "fruits = tuple(fruits)\n",
    "print(f\"\\nConverted to tuple: {fruits}\")            ### ( )\n",
    "print(\"first element in coordinate: \", fruits[0])   ### indexing"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b38bfae4",
   "metadata": {},
   "source": [
    "Since tuples are immutable, the following operation would generate an error:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "83fbefd2",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%expect TypeError\n",
    "fruits[2] = \"cherry\"  # This would cause an error because tuples are immutable!\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ac1100c8",
   "metadata": {},
   "source": [
    "## Dictionaries (Key-Value Pairs)\n",
    "\n",
    "In Python, a mapping type is a collection that stores data as `key–value` pairs, where each key is unique and maps to a corresponding value. The most common mapping type is the **`dictionary`** (`dict`), which allows flexible data organization and fast lookup, insertion, and modification of values by key rather than numerical index. An example of a Python dictionary: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "95f0aadb",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'name': 'Alice', 'age': 20, 'major': 'IST'}"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "student = {\n",
    "    \"name\": \"Alice\",\n",
    "    \"age\": 20,\n",
    "    \"major\": \"IST\"\n",
    "}\n",
    "student"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "32c6ea00",
   "metadata": {},
   "source": [
    "Common dictionary operations:\n",
    "- **Access**: `student[\"name\"]` → `\"Alice\"`\n",
    "- Add/Update: `student[\"gpa\"] = 3.8`\n",
    "- **Keys**: `student.keys()` → `dict_keys(['name', 'age', 'major'])`\n",
    "- **Values**: `student.values()` → `dict_values(['Alice', 20, 'Computer Science'])`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "b3a8424d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Student Name: Alice\n",
      "student: {'name': 'Alice', 'age': 20, 'major': 'CS', 'GPA': 3.5}\n",
      "dict_keys(['name', 'age', 'major', 'GPA'])\n",
      "dict_values(['Alice', 20, 'CS', 3.5])\n"
     ]
    }
   ],
   "source": [
    "print(f\"Student Name: {student['name']}\")   ### 1. access value by key\n",
    "student[\"major\"] = \"CS\"                     ### 2. update: MUTABLE\n",
    "student[\"GPA\"] = 3.5                        ### 2. add new key-value pair: MUTABLE\n",
    "print(f\"student: {student}\")\n",
    "\n",
    "print(student.keys())                       ### 3. keys\n",
    "print(student.values())                     ### 4. values"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1aa30c31",
   "metadata": {},
   "source": [
    "## Sets\n",
    "\n",
    "A **set** is an unordered collection of unique elements. Sets are useful for removing duplicates and performing mathematical set operations.\n",
    "\n",
    "Key characteristics:\n",
    "- **Unordered**: Elements have no defined order\n",
    "- **Unique**: Automatically removes duplicate values\n",
    "- **Mutable**: Can add/remove elements\n",
    "- **No indexing**: Cannot access elements by position"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "cea15073",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Set: {1, 2, 3, 4, 5}\n",
      "Unique values: {8, 9, 2, 5}\n",
      "From list: {24.1, 25.0, 23.5}\n"
     ]
    }
   ],
   "source": [
    "# Creating sets\n",
    "unique_numbers = {1, 2, 3, 4, 5}\n",
    "print(f\"Set: {unique_numbers}\")\n",
    "\n",
    "# Duplicates are automatically removed\n",
    "data = {5, 2, 8, 2, 5, 9}\n",
    "print(f\"Unique values: {data}\")\n",
    "\n",
    "# Creating from a list\n",
    "measurements = [23.5, 24.1, 23.5, 25.0]\n",
    "unique = set(measurements)\n",
    "print(f\"From list: {unique}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "5291d953",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Union: {1, 2, 3, 4, 5}\n",
      "Intersection: {3}\n",
      "Difference: {1, 2}\n"
     ]
    }
   ],
   "source": [
    "# Basic set operations\n",
    "set_a = {1, 2, 3}\n",
    "set_b = {3, 4, 5}\n",
    "\n",
    "print(f\"Union: {set_a | set_b}\")          # All elements\n",
    "print(f\"Intersection: {set_a & set_b}\")   # Common elements\n",
    "print(f\"Difference: {set_a - set_b}\")     # In A but not B"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Tags",
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
