Matlab Data Structures
Dr. Antonio A. Trani
Professor, Dept. of Civil and Environmental Engineering
Virginia Tech (copyright A.A. Trani)

Why Learn Data Structures?
• Engineers need to manipulate large amounts of data
• Data sometimes comes in a variety of formats
• Data is both numeric and character (or "string") data
• Matlab has two important structures that you should be familiar with:
  • Struct arrays
  • Cell arrays

Recall: Reading Data Files Using the Textscan Command
• Using the textscan command
• Here is a sample script to read a text file containing data on bridges of the world

fid = fopen('bridges_of_the_world_short')
readHeader = textscan(fid, '%s', 4, 'delimiter', '|');
readData = textscan(fid, '%s %s %f %f');
fclose(fid);

Data File (bridges_of_the_world_short)

Name | Country | Completed | Length (m)
Mackinac United-States 1957 8038
Xiasha China 1991 8230
Virginia-Dare-Memorial United-States 2002 8369
General-Rafael-Urdaneta Venezuela 1962 8678
Sunshine-Skyway United-States 1987 8851
Twin-Span United-States 1960 8851
Wuhu-Yangtze-River China 2000 10020
Third-Mainland Nigeria 1991 10500
Seven-Mile United-States 1982 10887
San-Mateo-Hayward United-States 1967 11265
Leziria-Bridge Portugal 2007 11670
Confederation Canada 1997 12900
Rio-Niterol Brazil 1974 13290
Kam-Sheung Hong Kong 2003 13400
Penang Malaysia 1985 13500
Vasco-da-Gama Portugal 1998 17185
Bonnet-Carre-Spillway United-States 1960 17702
Chesapeake-Bay-Bridge-Tunnel United-States 1964 24140
Tianjin-Binhai China 2003 25800
Atchafalaya-Swamp-Freeway United-States 1973 29290
Donghai China 2005 32500
Manchac-Swamp United-States 1970 36710
Lake-Pontchartrain-Causeway United-States 1956 38422

Explanations of the Matlab Script

fid = fopen('bridges_of_the_world_short')
• fid - file ID assigned by Matlab
• fopen - opens the text file called 'bridges_of_the_world_short' so that it can be read

readHeader = textscan(fid, '%s', 4, 'delimiter', '|');
• Variable readHeader stores the contents of the first row of the file ('bridges_of_the_world_short')
• textscan reads the first row of the file using '%s', 4 (four string fields) with 'delimiter' set to '|'

Name | Country | Completed | Length (m)
Mackinac United-States 1957 8038
Xiasha China 1991 8230
Virginia-Dare-Memorial United-States 2002 8369

readData = textscan(fid, '%s %s %f %f');
• Variable readData stores the contents of the file ('bridges_of_the_world_short') from the second row to the end
• textscan reads each data row using '%s %s' (two string fields) followed by '%f %f' (two numeric fields; f stands for floating point)

fclose(fid);
• fclose(fid) closes the file (fid) opened at the beginning of the script

What is Produced by the Matlab Script?
• Four variables (2 are temporary - ans and fid)
• Two variables with the information in the file (readHeader and readData)
• Both variables are cell arrays

What is a Cell Array?
• A special structure in Matlab to store dissimilar data types (i.e., strings and numeric data)

>> readData
readData =
    {14x1 cell}    {14x1 cell}    [13x1 double]    [13x1 double]

Columns, in order: Bridge Name, Country, Year Completed, Length (m)
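
The read sequence explained above can be tried without the course file. The following is a minimal sketch, assuming a three-row copy of the bridge data written to a temporary file; the temporary file name and the reduced row count are assumptions made for illustration and are not part of the lecture material.

```matlab
% Sketch: write a tiny copy of the bridge file, then read it back with textscan.
% The temporary file name is arbitrary (hypothetical, not the course file).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'Name | Country | Completed | Length (m)\n');
fprintf(fid, 'Mackinac United-States 1957 8038\n');
fprintf(fid, 'Xiasha China 1991 8230\n');
fprintf(fid, 'Virginia-Dare-Memorial United-States 2002 8369\n');
fclose(fid);

% Same three steps as the lecture script: open, read header, read data, close.
fid = fopen(fname, 'r');
readHeader = textscan(fid, '%s', 4, 'delimiter', '|');  % four header labels
readData   = textscan(fid, '%s %s %f %f');              % two text and two numeric columns
fclose(fid);
delete(fname);                                          % remove the temporary file

disp(class(readData))    % both outputs are cell arrays
disp(size(readData{1}))  % one row per bridge in the first column
disp(readData{4}')       % bridge lengths in meters
```

Each format specifier produces one cell of the output, so readData has four cells regardless of how many rows the file contains.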

Addressing the Contents of a Cell Array
• Cell arrays are referenced using curly brackets first, then standard parentheses to address individual elements of the cell array
• readData{1} references the first column of the cell array
• readData{1}(3,1) references the third row element of the first column
• readData{1}(3:5,1) references the third, fourth and fifth row elements of the first column
• readData{3} references all the elements of the third column of the cell array
• readData{3}(1:5,1) references the first five row elements of the third column of the cell array

Manipulating Data inside Cell Arrays
• Now that we have the data, try a few things:
• Question 1: Suppose we want to know how many of the bridges of the world are in the United States
• Question 2: Suppose we want to know the average length of the bridges in China
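
The addressing rules above can be checked with a small stand-in for readData. This is a sketch only: the cell array below is typed in by hand from the first five rows of the data file instead of being produced by textscan, and the variable names on the left-hand side are hypothetical.

```matlab
% Stand-in for the readData cell array (first five rows of the bridge file).
readData = { ...
    {'Mackinac'; 'Xiasha'; 'Virginia-Dare-Memorial'; 'General-Rafael-Urdaneta'; 'Sunshine-Skyway'}, ...
    {'United-States'; 'China'; 'United-States'; 'Venezuela'; 'United-States'}, ...
    [1957; 1991; 2002; 1962; 1987], ...
    [8038; 8230; 8369; 8678; 8851]};

allNames   = readData{1};          % first column: 5x1 cell array of names
thirdCell  = readData{1}(3,1);     % 1x1 cell that holds the third name
thirdName  = readData{1}{3};       % the third name itself, as a character array
rows3to5   = readData{1}(3:5,1);   % 3x1 cell array with rows 3, 4 and 5
allYears   = readData{3};          % third column: 5x1 numeric array
firstFive  = readData{3}(1:5,1);   % first five years (here, all of them)

disp(thirdName)
disp(firstFive')
```

Parentheses applied to a cell column return a smaller cell array, while a second pair of curly brackets returns the content itself. For the numeric columns, parentheses return ordinary numbers.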

Question 1: Suppose we want to know how many of the bridges of the world are in the United States
• Use the Matlab string comparison function strcmp
• strcmp compares a string with each element of an array of strings (or cell array) and outputs an array with a one at every position where a match occurs

strcmp(readData{2},'United-States')

• Here we compare the elements of cell array readData{2} with the string 'United-States'
• Elements of readData{2} that match 'United-States' are assigned a one, otherwise a zero
[Screenshot: original data readData{2} next to the strcmp output]
[Screenshot: annotated strcmp statement labeling the string to match, the array that we want to match, the string comparison function, and the variable that contains the indices of the array that match]
• To get the number of bridges, just sum the elements of the variable "Matches"
• For this example, there are 11 bridges listed under United-States

Question 2: Suppose we want to know the average length of the bridges in China
• Variable "chineseBridges" contains the indices of bridges located in China (partially shown)
• Counting the ones in "chineseBridges" tells us the number of bridges in China
• Create a new variable "lengthOfbridgesInChina" to extract the lengths of all Chinese bridges
• Variable "lengthOfbridgesInChina" contains the lengths of the bridges in China found in the list (partially shown)
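
The statements behind both questions appear only as screenshots, so the following is a sketch of one way to write them, not necessarily the code shown in class. It assumes hand-typed stand-ins for readData{2} and readData{4} holding the first seven rows of the data file, so the counts differ from the 23-row results quoted on the slides.

```matlab
% Stand-ins for readData{2} (countries) and readData{4} (lengths in meters),
% typed from the first seven rows of the bridge file.
country = {'United-States'; 'China'; 'United-States'; 'Venezuela'; ...
           'United-States'; 'United-States'; 'China'};
lengthM = [8038; 8230; 8369; 8678; 8851; 8851; 10020];

% Question 1: count the bridges listed under United-States.
Matches = strcmp(country, 'United-States');   % ones where the country matches
numberOfUSBridges = sum(Matches);

% Question 2: average length of the bridges in China.
chineseBridges         = strcmp(country, 'China');   % index array of zeros and ones
lengthOfBridgesInChina = lengthM(chineseBridges);    % keep only the matching rows
averageLengthInChina   = mean(lengthOfBridgesInChina);

fprintf('US bridges: %d\n', numberOfUSBridges);
fprintf('Average length in China: %.1f m\n', averageLengthInChina);
```

Because strcmp returns a logical array, it can be summed to count matches and also used directly as an index into any other column of the same length.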

Observe What is Going On
• The array (or matrix) "chineseBridges" is an index matrix of zeros and ones
• The array "chineseBridges" acts as a pointer variable for other computations (a variable used to designate the positions of interest in the original array)
• To get the total length of the bridges in China, just add the lengths of the individual bridges found in the previous step

Question 2: Suppose we want to know the average length of the bridges in China
• Create a new variable "totalLengthOfBridgesInChina" to compute the total length of bridges in China
• A total of 76,550 meters of bridges are found in China
• The average bridge length is then about 19,138 meters (76,550 meters divided by 4 bridges)

Try the Following in Class
• Find the number of bridges in Portugal
• Find the number of miles of bridges in Portugal
• Find the oldest bridge in Portugal

Matlab Structured Arrays

What is a Structured Array?
• Another Matlab data structure that can store dissimilar information (i.e., strings, numerics)
• A structured file (or struct file) is defined by a parent-child relationship
• The parent structure contains high-level information about the data set
• "Children" branches contain detailed information related to the parent

Parent
  Child I
  Child II
  Child III

A Sample Structured Array
• An engineer wants to store data about various airports in a Matlab struct file
• The data structure is shown below
• "Airport" has "children" (like name, type) and some children have "grandchildren" (i.e., location)

Airport
  Name
  Type
  Location
    Latitude
    Longitude
    Elevation
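
The parent, child and grandchild layout above maps directly onto nested fields. A minimal sketch follows, assuming two records: the first uses the San Francisco values given later in the lecture, and the second ("Example Field") is a made-up placeholder that is not part of the course data.

```matlab
% Build a struct array one field at a time (parent.child.grandchild notation).
airport(1).name = 'San Francisco';
airport(1).type = 'Public';
airport(1).location.latitude  = 37.62;    % degrees
airport(1).location.longitude = 122.38;   % degrees, as written on the slide
airport(1).location.elevation = 13;       % feet

% Placeholder record with invented values, for illustration only.
airport(2).name = 'Example Field';
airport(2).type = 'Private';
airport(2).location = struct('latitude', 40.0, 'longitude', 100.0, 'elevation', 500);

disp(numel(airport))        % number of records in the struct array
disp(fieldnames(airport))   % the children: name, type, location
fprintf('%s (%s): elevation %d ft\n', ...
    airport(1).name, airport(1).type, airport(1).location.elevation);
```

Assigning to airport(2) grows the array automatically, and every record shares the same set of children even when a field is filled in later.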

Sample Struct Array (Airport)
• The struct file "airport" has 3 instances (i.e., records)
• Each instance represents a record of a distinct airport
• Matlab tells you airport is a struct array
• Matlab tells you that there are three children

Inspecting the Contents of Struct Array Airport
• Instances of the struct array are called directly like any other array in Matlab
• Type "airport(1)" at the command line to query the first instance of struct array airport
• Matlab tells you more about the data structure inside each child of airport
• Check the contents of airport(1).name
• Note: children are referenced using the parent.child notation
• Check the contents of airport(1).type
• Check the contents of airport(1).location
• The "child" called "location" has 3 numeric pieces of data:
  • Latitude (degrees)
  • Longitude (degrees)
  • Elevation (feet)

Usefulness of Struct Arrays
• Fast to retrieve when stored as .mat files
• Organize data in a logical and convenient way
• Can contain dissimilar structures (i.e., numeric and string data)
• Can be manipulated to do complex operations or to plot data (shown next)

Manipulating Struct Array "Airport"
• Task: retrieve the locations of the airports contained in the airport struct array and plot them on a map
• Plot: airport(i).location.longitude vs airport(i).location.latitude
• We use a simple map contained in a binary file called "usamap"
• Both files (airport.mat and usamap.mat) are available on the syllabus page

Matlab Code
[Screenshot of the Matlab script]
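
The script on this slide survives only as an image. The following is a sketch of how the plotting task could be approached, not a reconstruction of the original code. It assumes a two-record airport array defined inline (the second record is an invented placeholder) and leaves out the usamap background, whose contents are not described in the slides.

```matlab
% Inline stand-in for the airport struct array normally loaded from airport.mat.
airport = struct( ...
    'name', {'San Francisco', 'Example Field'}, ...
    'type', {'Public', 'Private'}, ...
    'location', {struct('latitude', 37.62, 'longitude', 122.38, 'elevation', 13), ...
                 struct('latitude', 40.00, 'longitude', 100.00, 'elevation', 500)});

% Collect the coordinates of every record into plain numeric vectors.
n   = numel(airport);
lon = zeros(n, 1);
lat = zeros(n, 1);
for i = 1:n
    lon(i) = airport(i).location.longitude;
    lat(i) = airport(i).location.latitude;
end

% Plot longitude (x) against latitude (y) and label each airport.
plot(lon, lat, 'ro', 'MarkerSize', 8);
hold on
for i = 1:n
    text(lon(i), lat(i), ['  ' airport(i).name]);
end
hold off
xlabel('Longitude (degrees)');
ylabel('Latitude (degrees)');
```

With the course files, the coastline data from usamap.mat would be drawn first so that the airport markers appear on top of the map.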

Output of the Simple Matlab Code
• A simple map is generated after the script executes
• Note the locations of the airports

To Do in Class
• Add x and y labels to the code
• Change the font size to make them better looking
• Try adding another airport to the database:
  airport(4).name = 'San Francisco'
  airport(4).type = 'Public'
  airport(4).location.latitude = 37.62
  airport(4).location.longitude = 122.38
  airport(4).location.elevation = 13

Reading Excel Data Files with Matlab
• Using the xlsread command
• Here is a sample script to read a data file containing data on bridges of the world

[num,txt,raw] = xlsread('bridges_of_the_world_short.xls','Bridge data');

• Reads the Excel worksheet named 'Bridge data' contained in the file called 'bridges_of_the_world_short.xls'
• Assigns all numeric data to variable 'num'
• Assigns all text data to variable 'txt'
• Stores the unprocessed contents of the worksheet (both numeric and text) in variable 'raw'

Review of Reading Data from Excel
• Now that you know how to define cell and struct arrays, you can easily read data from Excel and store it in either of the two Matlab structures
• The point is that it is often easier to do complex tasks in Matlab after you read an Excel file

Excel File to be Read
• bridges_of_the_world_short.xls

What Happens after Executing the One Line Script?
• Three arrays are created using the previous script
• Array 'num' is a standard matrix with size (23 x 2)
• Arrays 'raw' and 'txt' are cell arrays (24 x 4) each

Observations
• 'num' is a standard numeric array as shown
• Elements of 'num' can be referenced in the usual (row,column) format
• num(2,2) = 8230

Observations (2)
• 'txt' is a cell array containing string data as shown
• Elements of 'txt' can be referenced using the cell array nomenclature cell{i}(row,column)
• txt{1,2} = Country

Note Differences in How Cell Arrays Store Information
• In the previous case, a cell array storing numerical data can be referenced as
• readData{3}(1:5,1)
• In this last case, the cell array contains string information
• txt{1}(1,1) = N (the first character of the string stored in the first cell)

Matlab xlsread can Read a Range in an Excel File
• The Matlab statement:

[num,txt,raw] = xlsread('bridges_of_the_world_short.xls','Bridge data','A2:D24');

• Reads the Excel file but only across the range specified (A2:D24)
• This is useful if you know the data structure of the file you are reading
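
The indexing differences between 'num' and 'txt' can be explored without the Excel file. The sketch below assumes hand-typed stand-ins for the first rows of the xlsread outputs, with worksheet headers taken to be the same as in the text file (Name, Country, Completed, Length (m)). It also shows one possible way to move the data into a struct array; the variable name "bridge" is hypothetical.

```matlab
% Stand-ins for the xlsread outputs (header row plus the first two bridges).
% With the real file these would come from:
% [num,txt,raw] = xlsread('bridges_of_the_world_short.xls','Bridge data');
txt = {'Name',     'Country',       'Completed', 'Length (m)'; ...
       'Mackinac', 'United-States', '',          ''; ...
       'Xiasha',   'China',         '',          ''};
num = [1957 8038; ...
       1991 8230];

lengthOfSecondBridge = num(2,2);    % ordinary (row,column) indexing
secondHeader         = txt{1,2};    % content of the cell in row 1, column 2
firstLetter          = txt{1}(1,1); % first character of the first cell

% Store each bridge as one record of a struct array.
% Row k of num corresponds to row k+1 of txt because txt includes the header.
for k = 1:size(num, 1)
    bridge(k).name      = txt{k+1, 1};
    bridge(k).country   = txt{k+1, 2};
    bridge(k).completed = num(k, 1);
    bridge(k).lengthM   = num(k, 2);
end

fprintf('%s, %s, %d, %d m\n', bridge(2).name, bridge(2).country, ...
    bridge(2).completed, bridge(2).lengthM);
```

Under these assumptions txt{1}(1,1) is the letter N and txt{1}(1,2) is the letter a, because the parentheses index into the characters of the string 'Name'.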