EIN 4905/ESI 6912 Decision Support Systems Excel
Spreadsheet-Based Decision Support Systems
Chapter 21: Working with Large Data Using VBA

Overview
– 21.1 Introduction
– 21.2 Creating Pivot Tables with VBA
– 21.3 Using External Data
– 21.4 Exporting Data
– 21.5 Applications
– 21.6 Summary

Introduction
– Creating pivot tables using VBA
– Importing data from text files or webpages using VBA
– Importing data from databases
– Creating basic queries using the SQL programming language
– Exporting data using VBA
– An application which allows a user to query a database from Excel

Creating Pivot Tables with VBA
In Chapter 6, we learned how to:
– Create pivot tables and pivot charts using the Insert > Tables > Pivot Table command on the Ribbon
– Filter large data by specifying row fields, column fields, report fields, and value fields
– Create calculated fields, include subtotals and grand totals, and manipulate the pivot table options
We will now learn the properties and methods in VBA that allow us to perform these tasks dynamically.

Figure 21.1 In this example, we dealt with the shipping costs for varying maximum weights and days to arrive for two different shipping companies.

Figure 21.2 In the pivot table we created, the row fields were “Days to Arrive” and “Max Weight, lbs,” the column field was “Shipping Companies,” and the data field was “Cost.” We also added a calculated field for minimum costs.

Creating Pivot Tables with VBA (cont’d)
The two main objects we use to create and manipulate pivot tables are the PivotCaches and PivotTables collections.
– PivotCaches is a collection of PivotCache objects on a Workbook object. A PivotCache is the memory cache where the data used by a pivot table is stored.
– PivotTables is a collection of PivotTable objects on a particular Worksheet object.
The first step in creating a pivot table is creating the corresponding cache.

PivotCaches
We declare a PivotCache object variable by using the Dim statement.
    Dim MyCache As PivotCache
We use the Create method to create a cache object. This method has three arguments:
– SourceType: specifies whether the data comes from the spreadsheet (xlDatabase), an external source (xlExternal), or multiple ranges (xlConsolidation).
– SourceData: specifies the data to use from this source type, such as a worksheet range.
– Version: specifies the version of the pivot table.
After creating the cache object, we use the Set statement to assign it to the MyCache variable.
    Set MyCache = ActiveWorkbook.PivotCaches.Create(xlDatabase, _
        Worksheets("Data_Shipping").Range("B3:E27"), xlPivotTableVersion14)

PivotTables
We declare a PivotTable object variable by using the Dim statement.
    Dim MyPT As PivotTable
We use the Add method to create a new PivotTable object. The Add method has three main arguments:
– PivotCache: specifies the cache for the table.
– TableDestination: specifies a range where the table should be placed.
– TableName: gives a name to the table.
After creating a new PivotTable object, we use the Set statement to assign it to the MyPT variable.
    Set MyPT = ActiveSheet.PivotTables.Add(MyCache, Range("G3"), "PivotTable1")

PivotFields
We use the PivotField.Orientation property to set the row fields, column fields, data field, and page field of the pivot table. This property takes the values xlDataField, xlRowField, xlColumnField, and xlPageField for the respective fields.
    With MyPT
        .PivotFields("Shipping Companies").Orientation = xlColumnField
        .PivotFields("Max Weight, lbs").Orientation = xlRowField
        .PivotFields("Days to Arrive").Orientation = xlRowField
        .PivotFields("Costs").Orientation = xlDataField
    End With
Another possible value for the Orientation property is xlHidden, which hides all of the values of the specified field.

PivotItems
We can refer to the items in a particular field of the pivot table using the PivotItems method.
For example, to refer to item “4” in the field “Days to Arrive,” we would type the following:
    MyPT.PivotFields("Days to Arrive").PivotItems("4")

RefreshOnFileOpen Property
RefreshOnFileOpen is a frequently used PivotCache property. For example, to refresh the data of “PivotTable1” on worksheet “Data_Shipping” every time the workbook is opened, we set this property to True by typing the following:
    Worksheets("Data_Shipping").PivotTables("PivotTable1").PivotCache.RefreshOnFileOpen = True

RowGrand and ColumnGrand Properties
For the PivotTable object, there are several other properties and methods to discuss. The RowGrand and ColumnGrand properties specify whether or not grand totals should be calculated for row or column fields, respectively.
– The possible values for these properties are True and False.
For example, to remove the cost totals for the row fields “Max Weight, lbs” and “Days to Arrive,” we would type the following:
    MyPT.RowGrand = False
Similarly, we use MyPT.ColumnGrand = False to remove the cost total for the column field “Shipping Companies.”

RefreshTable Method
A useful method of the PivotTable object is the RefreshTable method.
– This method is equivalent to pressing the Refresh command icon on the PivotTable Tools tab.
– If any changes are made to the data from which the pivot table was created, refreshing will update the pivot table data.
    MyPT.RefreshTable

GetPivotData Method
One last useful method of the PivotTable object is GetPivotData.
– This method has the same functionality as the GETPIVOTDATA function defined in Chapter 6.
– For a specific item in a given row or column field, this method finds the corresponding value from the data field.
    GetPivotData("DataFieldName", "RowOrColumnFieldName", "ItemName")
If only one row or column field is given with a paired item name, then this method returns a grand total or subtotal value from the data field.
However, if more than one row or column field is given, then the method narrows the search as much as possible to return the specific value from the data field.

Function Property
Two properties can be used to make calculations (sum, average, min, max, etc.): the Function property and the Subtotals property.
The Function property is used for data fields.
– To use this property, simply specify the type of calculation to perform on the named field.
    MyPT.PivotFields("Sum of Costs").Function = xlMin

Subtotals Property
The Subtotals property is used for non-data fields.
– With this property, you must specify an index number, or numbers, representing the type of subtotals to show for the given field.
– These index values include 2 = sum, 3 = count, 4 = average, 5 = max, and 6 = min, among others.
    MyPT.PivotFields("Max Weight, lbs").Subtotals(6) = True

Visible Property
The main property used with the PivotItem object is the Visible property.
– Using this property is similar to clicking on the drop-down list of values for a field in a pivot table and checking or unchecking the values to be displayed.
– The values for this property are True and False, much like we have seen in uses of the Visible property with other objects.
    With MyPT
        .PivotFields("Days to Arrive").PivotItems("1").Visible = True
        .PivotFields("Days to Arrive").PivotItems("8").Visible = True
        .PivotFields("Days to Arrive").PivotItems("2").Visible = False
        .PivotFields("Days to Arrive").PivotItems("4").Visible = False
    End With

ShowPivotTableFieldList Property
One last useful property is the ShowPivotTableFieldList property, which is used with a Workbook object.
– This property takes the value True or False to show or hide the pivot table field list for the pivot tables in the workbook.
    ActiveWorkbook.ShowPivotTableFieldList = True

Figure 21.3 This is a complete procedure for creating the shipping costs pivot table.

Using External Data
– Importing Data
   – Text Files and Webpages
   – Databases
– Performing Queries with SQL

Importing Data
We will first describe how to import data from text files and web addresses using VBA. We use a collection called QueryTables, which is a member of the Worksheet object.
    ActiveSheet.QueryTables

Importing Data (cont’d)
To import data, we add a QueryTable object using the Add method. The Add method has two required arguments:
– Connection: specifies the type of data being imported and the location of the data.
– Destination: specifies the location on the spreadsheet where the imported data should be placed.

Connection Argument
The Connection argument indicates whether we are importing data from a text file or a webpage. If we are importing data from a text file, we define the Connection argument as follows.
    Connection:="TEXT;path"
Here, path is the location of the text file on your computer, given as a string value.

Connection Argument (cont’d)
The path value can also be given dynamically by prompting the user for the path and storing it in a string variable. This path value must then be concatenated with the TEXT specification.
    Dim UserPath As String
    UserPath = InputBox("Enter path of text file.")
    Connection:="TEXT;" & UserPath

Connection Argument (cont’d)
In creating dynamic imports, you may prefer to let the user browse for a file rather than enter the path. To display an explorer browse window, we use the GetOpenFilename method of the Application object.
– This method presents the user with a browse window and allows them to select a file.
– The name of the file is returned as a string value.
    Application.GetOpenFilename(FileFilter, FilterIndex, Title, ButtonText, MultiSelect)

Connection Argument (cont’d)
The FileFilter argument gives you the option of limiting the type of file the user can select.
    "Text Files (*.txt), *.txt"
The Title argument allows you to give a title to the browse window that will appear.
The MultiSelect argument takes the value True or False to determine whether the user can select more than one file or only one file, respectively.
    Dim UserPath As String
    UserPath = Application.GetOpenFilename("Text Files (*.txt), *.txt", , "Select a file to import.", , False)
    Connection:="TEXT;" & UserPath

Connection Argument (cont’d)
If we are importing data from a webpage, we define the Connection argument as follows:
    Connection:="URL;actual URL"
Here, the actual URL is the URL of the website. Again, this value could be taken from the user dynamically.

Destination Argument
The Destination argument value is simply a range. Columns and rows will be created for the data as needed. The output range for the entire table of data begins at the Destination range.
    Destination:=Range("A1")

Importing Text Procedure
The properties needed for importing a text file describe how the text is organized in the file so that the values are imported correctly.
    With ActiveSheet.QueryTables.Add(Connection:="TEXT;C:\MyDocuments\textfile.txt", Destination:=Range("A1"))
        .Name = "ImportTextFile"
        .FieldNames = True
        .RowNumbers = False
        .TextFileStartRow = 1
        .TextFileParseType = xlDelimited
        .TextFileTextQualifier = xlTextQualifierDoubleQuote
        .TextFileCommaDelimiter = True
        .Refresh BackgroundQuery:=False
    End With

Figures 21.4, 21.5, and 21.6 An example text file imported to Excel using VBA.

Importing Webpage Procedure
To import a webpage, there are a few new properties needed.
    With ActiveSheet.QueryTables.Add(Connection:="URL;http://www.webpage.com", Destination:=Range("C1"))
        .Name = "WebpageQuery1"
        .FieldNames = True
        .RowNumbers = False
        .WebSelectionType = xlSpecifiedTables
        .WebFormatting = xlWebFormattingNone
        .WebTables = "1"
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .Refresh BackgroundQuery:=False
    End With

Figure 21.7 Suppose we wish to import into Excel the most recent mortgage rates quoted by CNN Money.

Figures 21.8 and 21.9 The importing procedure and the imported data.

Importing Databases
There are two main systems used in VBA for communicating with databases as external data sources.
– Data Access Objects (DAO): used to import and manipulate data, primarily from databases. To use DAO, we must first reference it in the VBE using the Tools > References menu options.
– ActiveX Data Objects (ADO): used to import and manipulate data from databases. Declaring variables with ADODB data types, as in the code that follows, likewise requires a reference to the Microsoft ActiveX Data Objects library.
Both ADO and DAO use Open Database Connectivity (ODBC) to securely access data from databases.
We have found that ADO objects are much simpler to use than DAO objects; therefore, we only discuss ADO in detail.

Importing Databases (cont’d)
There are two main ADO objects used to import data:
– Connection
– Recordset
The Connection object establishes the communication to a particular database. There are two main methods used with this object.
– The Open method uses a ConnectionString argument to define the path to the database.
– The Close method does not have any arguments.
A Connection should be opened and closed every time a query or import is made from the database.

Connection Object (cont’d)
To define a Connection object variable, we use a data type called ADODB.Connection. We declare the variable as an ADODB.Connection and then use the Set statement to assign a connection to it. We create a new connection using the New keyword.
    Dim cntMyConnection As ADODB.Connection
    Set cntMyConnection = New ADODB.Connection

Connection Object (cont’d)
Now we need to define the data provider, or database type, and the data source, or filename, of this connection.
– These values are given to the ConnectionString argument of the Open method.
– The data provider we will usually use is "Microsoft.ACE.OLEDB.12.0" (older setups that only read .mdb files use "Microsoft.Jet.OLEDB.4.0").
– The data source should be the path of the database file followed by its filename.
    Dim dbMyDatabase As String
    dbMyDatabase = ThisWorkbook.Path & "\MyDatabase.mdb"

Connection Object (cont’d)
Now that we have the data provider and data source, we can either assign these values directly to the ConnectionString argument or store them in a String variable. The ConnectionString value has two sub-arguments, Provider and Data Source, for the data provider and data source, respectively.
    Dim CnctSource As String
    CnctSource = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & dbMyDatabase & ";"

Connection Object (cont’d)
The complete code to open a connection is:
    Dim cntMyConnection As ADODB.Connection, dbMyDatabase As String, CnctSource As String
    Set cntMyConnection = New ADODB.Connection
    dbMyDatabase = ThisWorkbook.Path & "\MyDatabase.mdb"
    CnctSource = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & dbMyDatabase & ";"
    cntMyConnection.Open ConnectionString:=CnctSource

Connection Object (cont’d)
After closing a Connection, we clear the Connection variable by setting it to Nothing. The complete code to close a connection is:
    cntMyConnection.Close
    Set cntMyConnection = Nothing

Recordset Object
The Recordset object is used to define a particular selection of data from the database that we are importing or manipulating.
– We will again use a variable to represent this object throughout the code; to define Recordset object variables, we use the ADODB.Recordset data type.
– We again use the Set statement to assign a New Recordset to this variable.
    Dim rstFirstRecordset As ADODB.Recordset
    Set rstFirstRecordset = New ADODB.Recordset

Recordset Object (cont’d)
The main arguments for the Open method of the Recordset object are:
– Source
– ActiveConnection
The Source argument defines the data that should be imported.
– The Source value is a string which contains SQL commands.
– As with the data source and ConnectionString values, we can use a String variable to hold these SQL commands and pass it as the value of the Source argument.
    Dim Src As String
    Src = "SELECT * FROM tblTable1"

Recordset Object (cont’d)
The ActiveConnection argument value is the open Connection object you have previously defined.
    rstFirstRecordset.Open Source:=Src, ActiveConnection:=cntMyConnection
To copy this data to the Excel spreadsheet, we use the Range object and a new method: CopyFromRecordset.
– This method only needs to be followed by the name of the Recordset variable you have just opened.
    Range("A1").CopyFromRecordset rstFirstRecordset

Recordset Object (cont’d)
In each procedure where we are importing or manipulating data from a database, we type the following.
    Dim rstFirstRecordset As ADODB.Recordset, Src As String
    Set rstFirstRecordset = New ADODB.Recordset
    Src = "SELECT * FROM tblTable1"
    rstFirstRecordset.Open Source:=Src, ActiveConnection:=cntMyConnection
    Range("A1").CopyFromRecordset rstFirstRecordset

Recordset Object (cont’d)
When we are done using this Recordset, we should clear its value; we do this using the Set statement with the value Nothing.
    Set rstFirstRecordset = Nothing

Importing Databases (cont’d)
In applications where you plan to make multiple queries to a database, we recommend creating a function procedure which can be called for each query.
    Function QueryData(Src, OutputRange)
        ' Open a connection to the database stored next to the workbook
        dbUnivInfo = ThisWorkbook.Path & "\UniversityInformationSystem.mdb"
        Set cntStudConnection = New ADODB.Connection
        CnctSource = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & dbUnivInfo & ";"
        cntStudConnection.Open ConnectionString:=CnctSource
        ' Run the query and copy the results to the worksheet
        Set rstNewQuery = New ADODB.Recordset
        rstNewQuery.Open Source:=Src, ActiveConnection:=cntStudConnection
        Range(OutputRange).CopyFromRecordset rstNewQuery
        ' Clean up
        Set rstNewQuery = Nothing
        cntStudConnection.Close
        Set cntStudConnection = Nothing
    End Function

Performing Queries with SQL
Structured Query Language (SQL) is the language used to perform queries, that is, to filter the data which is imported. SQL commands are used to define the Source argument of the Open method of the Recordset object. You can define the Source to be all values in a particular database table or pre-defined query, or you can write a query as the value of the Source argument.

SQL (cont’d)
The basic structure of SQL commands is:
1. A statement which specifies an action to perform.
2. A statement which specifies the location of the data on which to perform the action.
3. A statement which specifies the criteria the data must meet in order for the action to be performed.
Some basic action statements are:
– SELECT
– CREATE
– INSERT

Figure 21.10 Consider a table from a University System database. This table, called tblStudents, contains student names, IDs, and GPAs.

SQL (cont’d)
The SELECT statement selects a specific group of data items from a table or query in the database. The phrase appearing immediately after the SELECT statement is the name or names of the fields which should be selected.
    SELECT StudentName FROM tblStudents

SQL (cont’d)
To select everything in a table, that is, all fields, use the asterisk (*) after the SELECT statement. We must also specify the location of the selected fields, that is, the table or query name from the database. We do this using the FROM statement.
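The SELECT and FROM statements described above can be tried end to end with a short sketch like the following. It is only an illustration under stated assumptions: the procedure name ImportAllStudents and the output cell A1 are hypothetical, the database file UniversityInformationSystem.mdb is assumed to sit in the workbook's folder and to contain tblStudents, and the ACE OLE DB provider is assumed to be installed. It uses late binding (CreateObject), so no library reference is needed.

```vba
Sub ImportAllStudents()
    ' Hypothetical sketch: the file name, table name, and output cell are assumptions.
    Dim cnt As Object, rst As Object
    Dim dbPath As String, Src As String

    dbPath = ThisWorkbook.Path & "\UniversityInformationSystem.mdb"

    ' Late binding: create the ADO objects without a library reference
    Set cnt = CreateObject("ADODB.Connection")
    cnt.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & dbPath & ";"

    ' SELECT * returns every field; FROM names the table to read
    Src = "SELECT * FROM tblStudents"
    Set rst = CreateObject("ADODB.Recordset")
    rst.Open Src, cnt

    ' Copy the returned records to the worksheet, starting at A1
    ActiveSheet.Range("A1").CopyFromRecordset rst

    ' Close and release both objects
    rst.Close
    cnt.Close
    Set rst = Nothing
    Set cnt = Nothing
End Sub
```

Replacing the asterisk with a field name, for example StudentName, would import only that column.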
SQL (cont’d)
We can also include filtering criteria in the query. The most common criteria statement is WHERE. The WHERE statement can use sub-statements such as:
– <, >, = for value evaluations.
– BETWEEN, LIKE, AND, OR, and NOT for other comparisons.
    SELECT StudentName FROM tblStudents WHERE GPA > 3.5

SQL (cont’d)
Other criteria statements include:
– GROUP BY
– ORDER BY
ORDER BY can be used with the WHERE statement to sort the selected data; this data can be sorted in ascending or descending order using the statements ASC or DESC, respectively.
    SELECT StudentName, GPA FROM tblStudents WHERE GPA > 3.0 ORDER BY GPA DESC

SQL (cont’d)
In a SELECT statement, we can also perform simple aggregate functions. Simply type the name of the function after the SELECT statement and list the field names to which the function applies in parentheses. One common function statement is COUNT.
– Using SELECT COUNT returns the number of items (matching any given criteria) instead of the items themselves.
    SELECT COUNT(StudentName) FROM tblStudents WHERE GPA > 3.5

SQL (cont’d)
Other functions include:
– MIN
– MAX
– AVG
    SELECT AVG(GPA) FROM tblStudents

SQL (cont’d)
In VBA, SQL statements always appear as a string; that is, they are enclosed in quotation marks. If your criteria check for a particular string value, you must use single quotation marks around that value.
    Src = "SELECT GPA FROM tblStudents WHERE StudentName = 'John Doe'"

SQL (cont’d)
Now suppose that, instead of specifying our own criteria, we want the user to determine which name to search for.
– We can use an Input Box and a variable, in this example called StudName, to prompt the user for this value.
– Then we can include this variable in place of the criteria value in the SQL statement.
    Src = "SELECT SSN FROM tblStudents WHERE StudentName = '" & StudName & "'"
Note that we still have to include the single quotation marks around the criteria value; therefore, the opening single quotation mark ends the first string, and the variable name is followed by a concatenated closing single quotation mark.

SQL (cont’d)
Now let us incorporate these SQL statements into our database query code.
– We use a string variable to hold the SQL commands.
– We then use this variable as the Source argument of the Open method of the Recordset object.
    Dim StudName As String
    StudName = InputBox("Please enter name of student whose GPA you want.")
    Src = "SELECT GPA FROM tblStudents WHERE StudentName = '" & StudName & "'"
    rstFirstRecordset.Open Source:=Src, ActiveConnection:=cntMyConnection
    Range("A1").CopyFromRecordset rstFirstRecordset

Exporting Data
We can also use SQL to export data. We can place data into a previously created Access database using the CREATE and INSERT SQL commands. The CREATE statement can be used to create a new table in the database.
– The corresponding location statement for the CREATE command is TABLE.
– The name of the new table is given after the TABLE statement.
– The name of the table is followed by the names of the fields for the new table; these are listed in parentheses, each with the data type the field should hold.
– You must also include a CONSTRAINT clause, inside the same parentheses, to specify the primary key of the table.
– You give a name to this key, specify that it is the PRIMARY KEY, and then list the selected field.
    Src = "CREATE TABLE tblCourses (CourseName TEXT, CourseNumber NUMBER, FacultyAssigned TEXT, CONSTRAINT CourseID PRIMARY KEY (CourseNumber))"

Exporting Data (cont’d)
Once you have created a table, you can use the INSERT statement to enter values for each field. The INSERT statement is always followed by the INTO location statement.
– The name of the table into which you are entering values is listed after the INTO statement.
– The field names for which you are entering values should then be listed in parentheses; listing them lets you enter values for only some of the fields.
– Then the values are listed after a VALUES statement in the same order in which the corresponding fields were listed.
    Src = "INSERT INTO tblCourses (CourseName, CourseNumber, FacultyAssigned) VALUES ('DSS', 234, 'J. Smith')"

Exporting Data (cont’d)
You can also use the UPDATE statement to change values in a previously created table. The UPDATE statement uses the SET location statement and the same criteria statements used with the SELECT command.
    Src = "UPDATE tblStudents SET GPA = 3.9 WHERE StudentName = 'Y. Zaals'"

Applications
Transcript Query
– We will develop an application which performs dynamic database queries using a pre-developed Access database.

Description
This database contains information on students, faculty, courses, sections, and grades; there are six tables and one query. In this application, we allow the user to query the database to retrieve transcript data for a particular student. This transcript data includes every course the student has taken, with the details of the course and section as well as the grade they earned. We then evaluate all grades to calculate the selected student’s overall GPA.

Figure 21.11 The tables and queries from MS Access.

Figure 21.12 The transcript query shows the course and grade information for a selected student.

Figure 21.13 The query function procedure.

Figure 21.14 The Main and CreateStudentList procedures.

Figure 21.15 The table “tblStudent” contains the list of the names of each student.

Figures 21.16 and 21.17 The form contains a combo box with a list of all students in the database. In the code for the form, we use the Initialize event of the UserForm to set the RowSource to “StudList”.

Figure 21.18 The combo box list of students is the most current list from the database.
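The Initialize step described for Figures 21.16 and 21.17 can be sketched roughly as follows. This is not the code from the figures, which are not reproduced here: the combo box name cmbStudents is hypothetical, and “StudList” is assumed to be a named worksheet range that already holds the student names imported from the database.

```vba
Private Sub UserForm_Initialize()
    ' cmbStudents is a hypothetical control name; substitute the
    ' combo box name used on your own form.
    ' "StudList" is assumed to be a named range holding the student names.
    cmbStudents.RowSource = "StudList"
End Sub
```

Because RowSource points at a range rather than a fixed list, re-importing the student table into that range before showing the form keeps the combo box current.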
Figure 21.19 The query “qryCourseID” has the course ID and section number for all the courses each student has taken.

Figure 21.20 The “tblCourse” table from Access.

Figure 21.21 The “tblSection” table from Access.

Figure 21.22 The transcript query code.

Application Conclusion
The application is now complete. Transcript queries can be made for any student selected from the form.

Summary
– The two main objects we use to create a pivot table are PivotCaches and PivotTables.
– We must specify a worksheet object, such as ActiveSheet, before specifying a PivotTables object.
– To create a pivot chart in VBA, simply use the Chart object.
– There are two main systems used in VBA for communicating with external data sources: DAO and ADO. (We use ADO in this chapter.)
– There are two main ADO objects used to import data: Connection and Recordset.
– Structured Query Language (SQL) is the language used to perform queries or filter the data which is imported.
– The basic structure of SQL commands is as follows: an action to perform, the location of the data on which to perform the action, and the criteria the data must meet in order for the action to be performed.
– Variables can be used to make queries dynamic with Input Boxes, User Forms, or by simply taking values the user has entered in a spreadsheet.
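The summary notes that a pivot chart can be created with the Chart object, but the chapter text gives no code for it. One possible sketch is shown below, assuming a pivot table named “PivotTable1” already exists on the “Data_Shipping” worksheet; the procedure name and the chart type are hypothetical choices. Pointing a chart's source data at a pivot table's range is what makes it a pivot chart.

```vba
Sub CreatePivotChartSketch()
    ' Hypothetical sketch: assumes "PivotTable1" exists on "Data_Shipping".
    Dim MyPT As PivotTable
    Dim MyChart As Chart

    Set MyPT = Worksheets("Data_Shipping").PivotTables("PivotTable1")

    ' Add a new chart sheet and use the pivot table's range as its source
    Set MyChart = Charts.Add
    MyChart.SetSourceData Source:=MyPT.TableRange1
    MyChart.ChartType = xlColumnClustered
End Sub
```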