From Analyst BI Day to Developer BI Night
Chris Voss
March 2017

About Me
- Data Management Analyst at McKesson Specialty Health
- Has worked with data for 8 years, including SQL Server since 2008 R2
- Pursuing MBA (Decision Analytics concentration) at North Carolina State University
- Avid runner and autism spectrum advocate
- Lover of obscure pop culture references

The types of analysts… what are you?

I was in this position too
- Three jobs and six years ago…
- I knew how to do some SQL statements from my college years and internship period
- I was asked to support a process which moved data from an Excel file to one of our databases
- A new table was necessary for a related project
- “Oh hey, we want to build an ad hoc instance on our new BI tool”
- Next thing you know, I was fully ingrained in the SQL Server world

A common foundation comparison

SQL Analyst
- Query and process data into reports
- Summarize and visualize
- Provide explanations to stakeholders by extracting value

SQL Developer
- Design and support of the database (ideally, data warehouse)
- Understanding administration fundamentals
- Concentrates on the flow of data from source to destination

Some new code basics
All because the DBA granted you more than READ access

Commands of the WRITE permission
- INSERT
- UPDATE
- DELETE
- CREATE
- ALTER
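
The five commands above can be tried safely in a throwaway database. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for SQL Server (T-SQL syntax differs in places); the table and column names are hypothetical.

```python
import sqlite3

# SQLite (in-memory) stands in for SQL Server here; nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE: define a new table (hypothetical name and columns).
cur.execute("CREATE TABLE album (id INTEGER PRIMARY KEY, album_name TEXT, year INTEGER)")

# ALTER: add a column to the existing table.
cur.execute("ALTER TABLE album ADD COLUMN genre TEXT")

# INSERT: add rows.
cur.executemany(
    "INSERT INTO album (id, album_name, year, genre) VALUES (?, ?, ?, ?)",
    [
        (1, "Sister", 1987, "Noise Rock"),
        (2, "OK Computer", 1997, None),
        (3, "Let It Bleed", 1969, "Rock"),
    ],
)

# UPDATE: change existing rows; the WHERE clause limits which rows change.
cur.execute("UPDATE album SET genre = ? WHERE id = ?", ("Alternative", 2))

# DELETE: remove rows; again limited by WHERE.
cur.execute("DELETE FROM album WHERE id = ?", (3,))
conn.commit()

rows = cur.execute("SELECT id, album_name, year, genre FROM album ORDER BY id").fetchall()
print(rows)
```

Leaving the WHERE clause off an UPDATE or DELETE changes every row in the table, which is one reason these commands sit behind a separate permission.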

Different data types
- Have you tried to select information from a query, and instead got this message?
  Conversion failed when converting the varchar value '72' to data type int.
- Data types become more important
  - Numeric (int, bigint, smallint)
  - Date (datetime, date)
- CAST and CONVERT when necessary
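
A small sketch of why an explicit CAST matters, assuming Python's sqlite3 module as a stand-in for SQL Server. SQLite is more forgiving than SQL Server and would not raise the conversion error quoted above, and it has no CONVERT function, so this only shows how numbers stored as text behave before and after a CAST. The table name and values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table where numbers were loaded as text.
conn.execute("CREATE TABLE reading (raw_value TEXT)")
conn.executemany("INSERT INTO reading VALUES (?)", [("72",), ("8",), ("100",)])

# Sorted as text, the values compare character by character.
as_text = [r[0] for r in conn.execute(
    "SELECT raw_value FROM reading ORDER BY raw_value")]

# After CAST, the values sort and compare as numbers.
as_int = [r[0] for r in conn.execute(
    "SELECT CAST(raw_value AS INTEGER) AS v FROM reading ORDER BY v")]

print(as_text)
print(as_int)
```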

Objects
- A new habit: fully qualifying your objects (database.schema.objectname) instead of just using the table or view name
- Use of indexes on big tables (more than 1000 rows)
  - Avoiding the table scans
  - Good is an index scan, better is an index seek
- You’ll get an idea of what can slow down your queries
  - LIKE qualifiers
  - ORDER BY
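
To see the difference between a scan and a seek, a query plan helps. The sketch below assumes Python's sqlite3 module as a stand-in: SQLite reports SEARCH where SQL Server would show an index seek, and SCAN where it would show a table scan. The table, index name, and generated rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE album (id INTEGER PRIMARY KEY, album_name TEXT, year INTEGER)")
# Generate a couple of thousand made-up rows so the table is not trivially small.
conn.executemany(
    "INSERT INTO album (album_name, year) VALUES (?, ?)",
    [(f"Album {i}", 1960 + i % 60) for i in range(2000)],
)
conn.execute("CREATE INDEX ix_album_year ON album (year)")


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# An equality filter on the indexed column can use the index.
seek_plan = plan("SELECT * FROM album WHERE year = 1997")

# A LIKE with a leading wildcard cannot, so every row is read.
scan_plan = plan("SELECT * FROM album WHERE album_name LIKE '%Computer%'")

print(seek_plan)
print(scan_plan)
```

In SQL Server the same comparison is usually made by looking at the execution plan in Management Studio.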

New tools
You’ll get to use more than Office and Management Studio

SQL Server Data Tools in Visual Studio, your new best friend
- SQL Server Integration Services
- …and SQL Server Reporting Services

Integration: Extract, Transform, Load
How the data gets in, and how the data comes out. You get to see why some data is very messy and can figure out ways to clean it up.
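
The extract, transform, load idea can be sketched in a few lines. This is not SSIS; it assumes Python's csv and sqlite3 modules as stand-ins, and the messy rows, the staging table name, and the cleaning rules are all made up for illustration.

```python
import csv
import io
import sqlite3

# Extract: a small in-memory CSV stands in for a real source file (made-up rows).
raw_csv = io.StringIO(
    "album_name,artist_name,year\n"
    " Sister ,Sonic Youth,1987\n"
    "OK Computer,Radiohead,1997\n"
    "Let It Bleed,The Rolling Stones,n/a\n"
)
records = list(csv.DictReader(raw_csv))


def clean(record):
    """Transform: trim whitespace and turn an unparseable year into None."""
    year_text = record["year"].strip()
    return (
        record["album_name"].strip(),
        record["artist_name"].strip(),
        int(year_text) if year_text.isdigit() else None,
    )


cleaned = [clean(r) for r in records]

# Load: write the cleaned rows into a staging table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE album_staging (album_name TEXT, artist_name TEXT, year INTEGER)")
conn.executemany("INSERT INTO album_staging VALUES (?, ?, ?)", cleaned)
conn.commit()

loaded = conn.execute(
    "SELECT album_name, artist_name, year FROM album_staging ORDER BY rowid"
).fetchall()
print(loaded)
```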

Reporting: Data Visualization
- Some of you may be building reports or making changes by downloading from Report Manager
- Now you get to build these using the data sources
- You may have experience with Tableau, Power BI, or another visualization tool
- Start creating data models to organize the data for your dashboards or reports

New concepts
There’s a method to the design madness: REFERENTIAL INTEGRITY

Did you keep hearing OLTP vs. OLAP?
- OLTP = transactional
  - Large numbers of write transactions
- OLAP = analytical
  - Fewer transactions and more complex queries
- Data Warehouse: Integrating data from multiple varied sources to support analytical reporting and decision making
- The two types of schemas
  - Star
  - Snowflake
- Normalization
  - Making it to at least Third Normal Form (3NF)
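
A star schema, as mentioned above, puts one fact table in the middle and joins it to descriptive dimension tables. The sketch below assumes Python's sqlite3 module as a stand-in for a warehouse; the fact and dimension tables and every value in them are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension table: descriptive attributes (hypothetical).
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);

-- Fact table: one row per sale, pointing at the dimension.
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product (product_id),
    amount     REAL
);

INSERT INTO dim_product VALUES (1, 'Vinyl'), (2, 'CD');
INSERT INTO fact_sales VALUES (1, 1, 20.0), (2, 1, 25.0), (3, 2, 10.0);
""")

# A typical analytical (OLAP-style) query: aggregate facts by a dimension attribute.
totals = conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
print(totals)
```

A snowflake schema would split the dimension further, for example moving category into its own table.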

PAUSE… What is normal form?

ID  AlbumName     ArtistName          Year  Genre_1      Genre_2
1   Sister        Sonic Youth         1987  Noise Rock
2   OK Computer   Radiohead           1997  Alternative  Indie Rock
3   Let It Bleed  The Rolling Stones  1969  Rock         Blues Rock

First normal form (1NF)
Eliminating repeating groups and columns

ID  AlbumName     ArtistName          Year  Genre
1   Sister        Sonic Youth         1987  Noise Rock
2   OK Computer   Radiohead           1997  Alternative
3   OK Computer   Radiohead           1997  Indie Rock
4   Let It Bleed  The Rolling Stones  1969  Rock
5   Let It Bleed  The Rolling Stones  1969  Blues Rock

Second normal form (2NF)
Removing duplicate data sets

ID  AlbumName     ArtistName          Year  Genre
1   Sister        Sonic Youth         1987  Noise Rock
2   OK Computer   Radiohead           1997  Alternative
3   OK Computer   Radiohead           1997  Indie Rock
4   Let It Bleed  The Rolling Stones  1969  Rock
5   Let It Bleed  The Rolling Stones  1969  Blues Rock

AlbumID  AlbumName     Year  ArtistID
1        Sister        1987  7
2        OK Computer   1997  3
3        Let It Bleed  1969  2
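
The split described above can be sketched as follows, assuming Python's sqlite3 module as a stand-in for SQL Server. The rows are the ones from the 1NF table; the table names album_flat, album, and album_genre are hypothetical. The slide goes one step further and replaces the artist name with an ArtistID; the artist table itself is not shown in the source, so it is left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE album_flat (id INTEGER PRIMARY KEY, album_name TEXT, "
    "artist_name TEXT, year INTEGER, genre TEXT)"
)
conn.executemany("INSERT INTO album_flat VALUES (?, ?, ?, ?, ?)", [
    (1, "Sister", "Sonic Youth", 1987, "Noise Rock"),
    (2, "OK Computer", "Radiohead", 1997, "Alternative"),
    (3, "OK Computer", "Radiohead", 1997, "Indie Rock"),
    (4, "Let It Bleed", "The Rolling Stones", 1969, "Rock"),
    (5, "Let It Bleed", "The Rolling Stones", 1969, "Blues Rock"),
])

conn.executescript("""
-- Each album is stored once.
CREATE TABLE album (
    album_id INTEGER PRIMARY KEY, album_name TEXT, artist_name TEXT, year INTEGER
);
INSERT INTO album (album_name, artist_name, year)
    SELECT album_name, artist_name, year
    FROM album_flat
    GROUP BY album_name, artist_name, year;

-- Genres move to their own table, one row per album and genre.
CREATE TABLE album_genre (album_id INTEGER REFERENCES album (album_id), genre TEXT);
INSERT INTO album_genre (album_id, genre)
    SELECT a.album_id, f.genre
    FROM album_flat AS f
    JOIN album AS a
      ON a.album_name = f.album_name AND a.artist_name = f.artist_name;
""")

album_count = conn.execute("SELECT COUNT(*) FROM album").fetchone()[0]
ok_genres = [r[0] for r in conn.execute("""
    SELECT g.genre
    FROM album_genre AS g
    JOIN album AS a ON a.album_id = g.album_id
    WHERE a.album_name = 'OK Computer'
    ORDER BY g.genre
""")]
print(album_count, ok_genres)
```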

Third normal form (3NF)
Eliminating columns that do not depend on the key

ID  PlayerName          SchoolID  StateID
1   David Thompson      1008      33
2   Christian Laettner  1012      33
3   Charles Barkley     2003      1

SchoolID  SchoolName            StateID
1008      North Carolina State  33
1012      Duke                  33
2003      Auburn                1

Database keys
- Primary key
- Foreign key
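
Primary and foreign keys are what enforce referential integrity. The sketch below assumes Python's sqlite3 module as a stand-in for SQL Server and reuses the player and school rows from the 3NF slide. Dropping StateID from the player table is this example's choice, in line with 3NF, and the rejected player row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is on (SQL Server always does).
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE school (
    school_id   INTEGER PRIMARY KEY,   -- primary key
    school_name TEXT NOT NULL,
    state_id    INTEGER NOT NULL
);
CREATE TABLE player (
    player_id   INTEGER PRIMARY KEY,   -- primary key
    player_name TEXT NOT NULL,
    school_id   INTEGER NOT NULL REFERENCES school (school_id)  -- foreign key
);
INSERT INTO school VALUES
    (1008, 'North Carolina State', 33), (1012, 'Duke', 33), (2003, 'Auburn', 1);
INSERT INTO player VALUES
    (1, 'David Thompson', 1008), (2, 'Christian Laettner', 1012), (3, 'Charles Barkley', 2003);
""")

# A player pointing at a school that does not exist is rejected.
try:
    conn.execute("INSERT INTO player VALUES (4, 'Hypothetical Player', 9999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The state is looked up through the school instead of being stored twice.
state_by_player = conn.execute("""
    SELECT p.player_name, s.state_id
    FROM player AS p
    JOIN school AS s ON s.school_id = p.school_id
    ORDER BY p.player_id
""").fetchall()
print(rejected, state_by_player)
```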

Relationships and applications
Can’t forget the skills you have with your analyst title, or who you get to talk to more often

What can an analyst do next?
- Start using new skills to improve query performance
- Branching out into machine learning
- If you have knowledge of a statistical language, try using SQL as a data source
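
Using SQL as a data source can be as simple as querying a table and handing the result to a statistics routine. This sketch assumes Python with its sqlite3 and statistics modules as stand-ins for SQL Server and a statistical language such as R; the table is hypothetical and holds the album years from the normal form slides.

```python
import sqlite3
import statistics

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE album (album_name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO album VALUES (?, ?)",
    [("Sister", 1987), ("OK Computer", 1997), ("Let It Bleed", 1969)],
)

# The query is the data source; the analysis happens outside the database.
years = [row[0] for row in conn.execute("SELECT year FROM album ORDER BY year")]
mean_year = statistics.mean(years)
median_year = statistics.median(years)
print(years, mean_year, median_year)
```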

The typical analyst bridge
- The business unit
- The developers/engineers

The bridge for the analyst and developer
- The business unit
- The database administrator

So how do I practice?
One method for basic development practice, and of course my method is best*

Build a database of your own
- Download the Microsoft data suite
  - SQL Server 2016
  - SQL Server Management Studio
  - SQL Server Data Tools
- Start with 3-4 tables in the database with limited numbers of fields
- Use a few different data types for fields
- Come up with one-to-many relationships
- Find a file to practice data integration using SQL Server Management Studio
  - Maybe a CSV or an Excel file you’ve been using for other practice

Make that database bigger
- This time, 5-6 tables, or…
- Find a practice database if you want to turn it up 15 notches
  - Lahman’s Baseball Database (SQL version for MySQL, CSV for SQL Server)
  - Wide World Importers (Full and DW versions)

Edit and analyze the database
- Add true keys to the one-to-many relationships that don’t have these yet
- Use your analysis skills to build some stored procedures of your own, no matter the complexity
- Run some UPDATE, DELETE, and INSERT statements
- Create a step in a current SSIS package to export this data from your DB into Excel instead
- Create some of the reports and stored procedures you typically run or build already
- Integrate the analysis
  - Use of a statistical language
  - Of course, if you get SQL Server 2016 for practice at home, use R Services if you know the R language!

Review with an experienced developer
- Even veterans need to do this
- Plenty of online resources to use in the MS data space
  - MSSQL Tips
  - SQL Server Central
  - MSDN
- Blogs that aggregate articles
  - Curated SQL
  - SQL Steve
- If you’re going to try this at your workplace, talk to your DBA first so you don’t feel their wrath later

The Choose Your Own Adventure Review
- Build your own database
- Utilize all the relationship types
- Build a more complex database
- Try setting up an integration procedure from some data files
- Try out building the stored procedures yourself
  - Ones that use the WRITE commands
- Get some feedback from experienced developers
- Know your limits

Any final questions?
Or, who is hungry for lunch?

Thank you for your time!
- Website: ceedubvoss.com
- LinkedIn: www.linkedin.com/in/cwvoss
- Twitter: @ceedubvee
Remember to post feedback on the Guidebook app!