The Third Lab

Install and Learn PostgreSQL (I)

Sepy 18, 2015

(I) Install PostgreSQL on Your Own Machine

  •         (1) Download PostgreSQL Here
                    Choose your operating system in "Binary packages".
                    Install the Version 9.2.13 ( The PostgreSQL version on CS Server is 9.2.13).
  •         (2) Install PostgreSQL.
                    Leave Port as 5432 .
                    Remember the password created during the installation.
  •         (3) Install Stack Builder (Automatically start after installing PostgreSQL)
                    Install "Database Drivers".
                            JDBC for Java Programming.
                            ODBC and OLE DB for C, C++, C# Programming.
                    Install "Web Development".
                            Leave Apache Port as 8080 .


(II) Open PostgreSQL on Your Own Machine

  •         (1) Open pgAdmin III .
  •         (2) Right click PostgreSQL 9.2 (The Default Server Name) and Connect.
  •         (3) Open shell to use PostgreSQL database. (Two Methods)
                    (a) Open SQL Shell program.
                    (b) Open a Terminal (Terminal on Mac and Linux, CMD on Windows).
                            (b1) Firstly, go to PostgreSQL bin folder, for example, "\...\PostgreSQL\9.2\bin"
                            (b2) Secondly, type psql -U [username] to enter shell in terminal.
  •         (4) Type \q to exit shell.
  •         (5) Right click PostgreSQL 9.2 and Disconnect server.


Either follow (I) & (II), if you are using your own machine.
Or follow (III), if you plan to use the server in CS department.


(III) Use PostgreSQL on CS460 Server

  •         (1) Connect to cs460.bu.edu server first.
                    (a) For the Linux and Mac users using BU network:
                            Open Terminal .
                            Type SSH [username]@cs460.bu.edu to log on the database server.
                    (b) For the Linux and Mac users outside the BU network:
                            Open Terminal .
                            Type SSH [username]@csa2.bu.edu and log on the csa2 server first.
                            Then type SSH [username]@cs460.bu.edu and log on to the database server.
                    (c) For the Windows users using BU network:
                            Open PuTTY .
                            Use [username]@cs460.bu.edu to log on the database server.
                    (d) For the Windows users outside the BU network:
                            Open PuTTY .
                            Use [username]@csa2.bu.edu to log on the csa2 server first.
                            Then type SSH [username]@cs460.bu.edu and log on to the database server.
  •         (2) After logging on cs460 server, type psql -U [username] to enter database shell.
  •         (3) Type \q to exit database shell.


(IV) Import Data to PostgreSQL

  •         (1) Download Hubway Dataset (Click 'Download the Data' Button at the Right of the Webpage).
                    (a) The file name should be hubway_2011_07_through_2013_11.zip
                    (b) Unzip the compressed data file.
                    (c) There are two datasets in the zip file: hubway_stations.csv and hubway_trips.csv
  •         (4) Importing Steps:
                    (a) Read the header (first line) of data file.
                    (b) Design the schema of the table.(Data Types)
                    (c) Enter database shell
                    (d) Create a table in database
                    (e) Copy data from the data file to the table created
                    (f) Verify data in the table
  •         (3) Follwing is an example of importing data from a csv file to a table in PostgreSQL database.


(V) Import Stations as an Example

  •         (1) Read the header (first line) of hubway_stations.csv data file. There are 7 attributes.
  •         (2) Decide the data type for each attribute: Using integer, char, varchar, decimal.
  •         (3) Create a station table in PostgreSQL. Use the following statement in PostgreSQL shell:
                    CREATE TABLE stations (
                        id integer,
                        terminal char(6),
                        station varchar(120),
                        municipal varchar(20),
                        lat decimal(8,6),
                        lng decimal(8,6),
                        status varchar(20)
                    );
  •         (4) Type \d stations to check the schema of table "stations"
  •         (5) Use \copy to copy the data from the CSV file to the table "stations" as follow:
                    \copy stations from '[path]/hubway_stations.csv' CSV HEADER;

                    [path] is the path where the file hubway_stations.csv is stored, for example ~/Downloads
  •         (6) Use SELECT COUNT(*) FROM stations; to check number of records.
  •         (7) Use SELECT * FROM stations LIMIT 10; to check first 10 records.


(VI) Exercise: Import Trips

  •         You need to learn the following things:
  •         (1) Data Types in PostgreSQL: Integer, Char, Varchar, Decimal, Timestamp, Text
  •         (2) CREATE TABLE command in PostgreSQL.
  •         (3) COPY command in PostgreSQL.

  •         (*) If you are using PostgreSQL on cs460.bu.edu server. DO NOT import the whole trip file into the table!
  •         (**) Use head -10000 hubway_trips.csv >> hubway_trips_sample.csv to get a sample file first. Then import the sample file into the table only.