Programming with Python
Basic data types in Python include integers, strings, and
floating-point numbers.
Use variable = value to assign a value to a variable in
order to record it in memory.
Variables are created on demand whenever a value is assigned to
them.
Use print(something) to display the value of
something.
Use # some kind of explanation to add comments to
programs.
Built-in functions are always available to use.
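The points above can be sketched in a few lines. The variable names and values here are made up for illustration only.

```python
# Assign values of the three basic types to variables.
weight_kg = 60              # integer
height_m = 1.72             # floating-point number
patient_id = "inflam_001"   # string

# A variable comes into existence the first time a value is assigned to it.
weight_lb = 2.2 * weight_kg

# Built-in functions such as print, type, and round need no import.
print(patient_id, "weighs", round(weight_lb, 1), "pounds")
print(type(weight_kg), type(height_m), type(patient_id))
```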
Import a library into a program using
import libraryname.
Use the pandas library to work with tabular data in
Python.
Use the read_csv function to load data into a dataframe
variable.
Use index_col to specify that a column’s values should
be used as row headings.
Use info to find out basic information about a
dataframe.
Use slices and loc to extract entries from a
dataframe.
The expression dataframe.shape gives the shape of the
underlying array as a (rows, columns) tuple.
Use label_a:label_c to specify a slice
that includes the rows or columns from label_a to, and
including, label_c.
Array indices start at 0, not 1.
Use low:high to specify a slice that
includes the indices from low to high-1.
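A small sketch of the pandas points above, assuming pandas is installed. The table contents and column names are invented, and io.StringIO stands in for a CSV file on disk. The iloc accessor is not named in the text; it is used here only to show position-based slicing next to label-based slicing.

```python
import io

import pandas as pd

# Invented data standing in for the contents of a CSV file.
csv_text = """country,gdp_1952,gdp_1957,gdp_1962
Albania,1601.0,1942.0,2312.0
Austria,6137.0,8842.0,10750.0
Belgium,8343.0,9714.0,10991.0
"""

# index_col makes the values of the "country" column the row headings.
data = pd.read_csv(io.StringIO(csv_text), index_col="country")

data.info()         # prints column names, types, and non-null counts
print(data.shape)   # (number of rows, number of columns)

# Label-based slice with loc: both end points are included.
by_label = data.loc["Albania":"Austria", "gdp_1952":"gdp_1957"]

# Position-based slice: indices start at 0 and the upper bound is excluded.
by_position = data.iloc[0:2, 0:2]

print(by_label)
```

Both slices select the same two rows and two columns, which illustrates the difference between the inclusive label:label form and the exclusive low:high form.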
Use the pyplot module from the matplotlib
library to create visualizations of data.
Dataframes have methods like min, max, and
mean to compute statistics along either the rows or the
columns.
Use the axis argument of these methods to choose whether the
statistic is computed for each column or for each row.
We can use add_subplot to create multiple plots in a
single figure.
We can customise the labels, axis ranges, line styles, and more of
our plots using matplotlib.
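The plotting points above might look like this in practice, assuming pandas and matplotlib are installed. The dataframe is invented for illustration, and the "Agg" backend is selected only so that the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, an assumption of this sketch

import matplotlib.pyplot as plt
import pandas as pd

# Invented measurements: one row per patient, one column per day.
data = pd.DataFrame(
    {"day_1": [0, 1, 2], "day_2": [2, 3, 5], "day_3": [4, 4, 8]},
    index=["patient_a", "patient_b", "patient_c"],
)

daily_mean = data.mean(axis=0)    # one value per column (per day)
patient_max = data.max(axis=1)    # one value per row (per patient)

# Two plots side by side in a single figure.
fig = plt.figure(figsize=(8.0, 3.0))
axes_mean = fig.add_subplot(1, 2, 1)
axes_max = fig.add_subplot(1, 2, 2)

axes_mean.plot(daily_mean.values, linestyle="--")
axes_mean.set_ylabel("mean per day")
axes_mean.set_ylim(0, 6)

axes_max.plot(patient_max.values)
axes_max.set_ylabel("max per patient")

fig.tight_layout()
```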
[value1, value2, value3, ...] creates a list.
Lists can contain any Python object, including other lists (i.e., a
list of lists).
Lists are indexed and sliced with square brackets (e.g., list[0] and
list[2:9]), in the same way as strings and arrays.
Lists are mutable (i.e., their values can be changed in place).
Strings are immutable (i.e., the characters in them cannot be
changed).
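A short sketch of the list behaviour described above, with illustrative values. The try/except construct is used only to show the error without stopping the program.

```python
odds = [1, 3, 5, 7]
print(odds[0], odds[1:3])      # indexing and slicing with square brackets

odds[1] = 9                    # lists are mutable: changed in place
nested = [[1, 2], [3, 4]]      # a list of lists
print(nested[1][0])

name = "Darwin"
try:
    name[0] = "d"              # strings are immutable, so this fails
except TypeError as error:
    print("cannot change a string in place:", error)
```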
Use for variable in sequence to process the elements of
a sequence one at a time.
The body of a for loop must be indented.
Use len(thing) to determine the length of something
that contains other values.
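For example, a for loop can count the characters of a string, which should agree with len. The word used here is arbitrary.

```python
word = "oxygen"

count = 0
for letter in word:        # the indented body runs once per character
    count = count + 1

print(count, len(word))
```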
Use glob.glob(pattern) to create a list of files whose
names match a pattern.
Use * in a pattern to match zero or more characters,
and ? to match any single character.
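A sketch of pattern matching with glob. So that it is self-contained, it first creates a few empty files with invented names in a temporary directory; in real use the files would already exist.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create empty files to match against (names are made up).
    for filename in ["data-01.csv", "data-02.csv", "data-10.csv", "notes.txt"]:
        open(os.path.join(folder, filename), "w").close()

    # * matches zero or more characters.
    all_csv = sorted(glob.glob(os.path.join(folder, "*.csv")))
    # ? matches exactly one character.
    zero_something = sorted(glob.glob(os.path.join(folder, "data-0?.csv")))

    csv_names = [os.path.basename(path) for path in all_csv]
    zero_names = [os.path.basename(path) for path in zero_something]

print(csv_names)
print(zero_names)
```

The results are sorted because glob.glob does not promise any particular order.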
Use if condition to start a conditional statement,
elif condition to provide additional tests, and
else to provide a default.
The bodies of the branches of conditional statements must be
indented.
Use == to test for equality.
X and Y is only true if both X and
Y are true.
X or Y is true if either X or
Y, or both, are true.
Zero, the empty string, and the empty list are considered false; all
other numbers, strings, and lists are considered true.
True and False represent truth
values.
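The conditional rules above can be tried out with a few arbitrary values:

```python
num = 37
if num > 100:
    size = "greater than 100"
elif num == 100:               # == tests equality; = would be assignment
    size = "equal to 100"
else:
    size = "less than 100"
print(num, "is", size)

both = (1 > 0) and (-1 > 0)    # only true if both sides are true
either = (1 > 0) or (-1 > 0)   # true if at least one side is true
print(both, either)

# Zero, the empty string, and the empty list count as false.
for value in [0, "", [], 3, "word", [0]]:
    if value:
        print(repr(value), "is considered true")
    else:
        print(repr(value), "is considered false")
```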
Define a function using
def function_name(parameter).
The body of a function must be indented.
Call a function using function_name(value).
Variables defined within a function can only be seen and used within
the body of the function.
Variables created outside of any function are called global
variables.
Within a function, we can access global variables.
If we want to do the same calculation on all entries in our columns,
we can pass the dataframe columns as the inputs to a function.
Use help(thing) to view help for something.
Put docstrings in functions to provide help for that function.
Specify default values for parameters when defining a function using
name=value in the parameter list.
Parameters can be passed by matching based on name, by position, or
by omitting them (in which case the default value is used).
Put code whose parameters change frequently in a function, then call
it with different parameter values to customize its behavior.
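A sketch that pulls the function points together. The function name, its ndigits parameter, and the global FREEZING_F are hypothetical choices for this example.

```python
FREEZING_F = 32   # a global variable, readable inside functions


def fahr_to_celsius(temp_f, ndigits=1):
    """Convert a temperature from Fahrenheit to Celsius, rounded to ndigits places."""
    # temp_c is local: it exists only while the function body runs.
    temp_c = (temp_f - FREEZING_F) * 5 / 9
    return round(temp_c, ndigits)


print(fahr_to_celsius(212))                    # by position, default ndigits
print(fahr_to_celsius(98.6, 2))                # both arguments by position
print(fahr_to_celsius(ndigits=0, temp_f=32))   # by name, in any order

help(fahr_to_celsius)   # shows the signature and the docstring
```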
Tracebacks can look intimidating, but they give us a lot of useful
information about what went wrong in our program, including where the
error occurred and what type of error it was.
An error having to do with the ‘grammar’ or syntax of the program is
called a SyntaxError. If the issue has to do with how the
code is indented, then it will be called an
IndentationError.
A NameError will occur when trying to use a variable
that does not exist. Possible causes are that a variable definition is
missing, a variable reference differs from its definition in spelling or
capitalization, or the code contains a string that is missing quotes
around it.
Containers like lists and strings will generate errors if you try to
access items in them that do not exist. This type of error is called an
IndexError.
Trying to read a file that does not exist will give you a
FileNotFoundError. Trying to read a file that is open for
writing, or writing to a file that is open for reading, will give you an
UnsupportedOperation error, which is a kind of IOError (in Python 3,
IOError is another name for OSError).
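The sketch below triggers several of these errors on purpose and records the name of each one. The try/except construct is not covered in the text; it is used here only so that the program keeps running after each error. Note that in Python 3 reading from a file opened for writing raises io.UnsupportedOperation, which is a subclass of OSError, and IOError is an alias of OSError.

```python
import io
import os
import tempfile

seen = []

try:
    print(undefined_name)              # no such variable
except NameError as error:
    seen.append(type(error).__name__)

try:
    [1, 2, 3][5]                       # index past the end of the list
except IndexError as error:
    seen.append(type(error).__name__)

try:
    open("no_such_file_for_this_sketch.txt")
except FileNotFoundError as error:
    seen.append(type(error).__name__)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "out.txt")
    with open(path, "w") as handle:
        try:
            handle.read()              # file is open for writing only
        except OSError as error:       # IOError is the same class as OSError
            seen.append(type(error).__name__)

print(seen)
```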
Program defensively, i.e., assume that errors are going to arise,
and write code to detect them when they do.
Put assertions in programs to check their state as they run, and to
help readers understand how those programs are supposed to work.
Use preconditions to check that the inputs to a function are safe to
use.
Use postconditions to check that the output from a function is safe
to use.
Write tests before writing code in order to help determine exactly
what that code is supposed to do.
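One way to combine these ideas is sketched below. The normalize function is a hypothetical example, not something from the text: its first assertions are preconditions, its last assertion is a postcondition, and the check at the bottom is the kind of test that could be written before the function body.

```python
def normalize(values):
    """Rescale a list of numbers so the smallest becomes 0.0 and the largest 1.0."""
    # Preconditions: the inputs must be safe to use.
    assert len(values) > 1, "need at least two values"
    low, high = min(values), max(values)
    assert high > low, "values must not all be equal"

    scaled = [(value - low) / (high - low) for value in values]

    # Postcondition: the output must have the promised range.
    assert min(scaled) == 0.0 and max(scaled) == 1.0, "result must span 0.0 to 1.0"
    return scaled


# A test that states what the code is supposed to do.
assert normalize([2, 4, 6]) == [0.0, 0.5, 1.0]
print(normalize([2, 4, 6]))
```

Calling normalize([3, 3]) fails the second precondition with an AssertionError instead of dividing by zero later on.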
Know what code is supposed to do before trying to debug
it.
Make it fail every time.
Make it fail fast.
Change one thing at a time, and for a reason.
Keep track of what you’ve done.
Be humble.
The sys library connects a Python program to the system
it is running on.
The list sys.argv contains the command-line arguments
that a program was run with.
Avoid silent failures.
The pseudo-file sys.stdin connects to a program’s
standard input.
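A sketch of a command-line tool built on these ideas. The count_lines function is hypothetical; it takes the argument list and the input stream as parameters so that it can be tried without a real command line, and io.StringIO stands in for standard input.

```python
import io
import sys


def count_lines(argv, stdin):
    """Count lines in the files named in argv[1:], or in stdin if none are named."""
    filenames = argv[1:]           # argv[0] is the name of the program itself
    if not filenames:
        return sum(1 for line in stdin)
    total = 0
    for filename in filenames:
        # A missing file raises FileNotFoundError rather than failing silently.
        with open(filename) as handle:
            total += sum(1 for line in handle)
    return total


# Stand-in for running the script with no arguments and two lines on standard input.
fake_stdin = io.StringIO("first line\nsecond line\n")
print(count_lines(["count_lines.py"], fake_stdin))

# In a real script the call would be: print(count_lines(sys.argv, sys.stdin))
```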