SQL Server 2016
PolyBase
Sean Werick
Principal Consultant
502.320.2918
SQL Server 2016 - Industry leading TCO
SQL Server 2016: Everything built-in
Built-in with SQL Server vs. expensive add-ons with Oracle
• OLTP: industry leader
• Data warehousing: #1 TPC-H
• ETL built-in
• In-memory built-in, across all workloads
• End-to-end security built-in
• Advanced analytics built-in (R Server, in-database), at massive scale
• Business intelligence and self-service BI; complete mobile BI built-in
• Consistent experience from on-premises to cloud
[Charts not recoverable from the extracted text: cost comparisons between SQL Server and Oracle, yearly counts for 2010-2015 covering SQL Server, Oracle, MySQL, and SAP HANA, TPC-H rankings, and self-service BI cost per user for Microsoft, Tableau, and Oracle.]
The above graphics were published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Microsoft. Gartner does not endorse any vendor, product or service depicted in its
research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all
warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
National Institute of Standards and Technology Comprehensive Vulnerability Database update 10/2015
TPC-H 10TB non-clustered results as of 04/06/15, 5/04/15, 4/15/14 and 11/25/13, respectively. http://www.tpc.org/tpch/results/tpch_perf_results.asp?resulttype=noncluster
In-memory enhancements
Operational analytics & enhanced performance
[Diagram: in-memory OLTP in SQL Server next to an in-memory ColumnStore data warehouse. Labels: ETL, 2-24 hrs; Real-time fraud detection; Fraud detected.]
Built-in advanced analytics
In-database analytics at massive scale
Example Solutions
• Sales forecasting
• Warehouse efficiency
• Predictive maintenance
• Credit risk protection
Extensibility
• R Integration
• New R scripts
Built-in to SQL Server
• Analytic Library
• T-SQL Interface
• Relational Data
Data Scientist: Interact directly with data
Data Developer/DBA: Manage data and analytics together
[Diagram label: Microsoft Azure Marketplace]
R integration and advanced analytics
Capability
• Extensible in-database analytics, integrated with R, exposed through T-SQL
• Centralize enterprise library for analytic models
Benefits
• Data Scientists: Publish algorithms, interact directly with data
• DBAs: Manage storage and analytics together
• Business Analysts: Analysis through T-SQL, tools, and vetted algorithms
[Diagram: SQL Server + R]
• Analytics library: Share and collaborate; Manage and deploy
• Analytical engines: Full R integration; Fully extensible
• Data Management Layer: Relational data; T-SQL interface; Stream data in-memory
Always Encrypted
Help protect data at rest and in motion, on-premises & cloud
Client side (trusted): Apps
Query: SELECT Name FROM Patients WHERE SSN=@SSN
@SSN='198-33-0987'
Result Set: Name = Jim Gray
Enhanced ADO.NET Library (between the app and SQL Server)
Query as sent to SQL Server: SELECT Name FROM Patients WHERE SSN=@SSN
@SSN=0x7ff654ae6d
Result Set: Name = Jim Gray
Keys shown: Column Master Key, Column Encryption Key
SQL Server: dbo.Patients (SSN stored as ciphertext)
Name        SSN (plaintext)  SSN (ciphertext)  Country
Jane Doe    243-24-9812      1x7fg655se2e      USA
Jim Gray    198-33-0987      0x7ff654ae6d      USA
John Smith  123-82-1095      0y8fj754ea2c      USA
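To make the SSN lookup on this slide possible, the column has to be declared as encrypted when the table is defined. The following is a minimal sketch, not the presenter's script: it assumes a column encryption key named CEK_Patients (a hypothetical name) already exists in the database, and the column sizes are illustrative.

```sql
-- Sketch only: CEK_Patients is a hypothetical column encryption key
-- that must already exist, protected by a column master key.
CREATE TABLE dbo.Patients (
    Name    nvarchar(60) NOT NULL,
    -- DETERMINISTIC encryption allows equality lookups such as
    -- WHERE SSN = @SSN; character columns need a BIN2 collation for it.
    SSN     char(11) COLLATE Latin1_General_BIN2
            ENCRYPTED WITH (
                COLUMN_ENCRYPTION_KEY = CEK_Patients,
                ENCRYPTION_TYPE = DETERMINISTIC,
                ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
            ) NOT NULL,
    Country nvarchar(30) NOT NULL
);
```

The client connection also has to opt in (for ADO.NET, Column Encryption Setting=Enabled in the connection string) so that the driver encrypts the @SSN parameter before it leaves the application.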
RLS in three steps
[Diagram: Nurse, Application, Database (Patients table), Policy Manager, Security Policy, Filter Predicate]
One: Policy Manager creates the filter predicate and the security policy in T-SQL
CREATE FUNCTION dbo.fn_securitypredicate(@wing int)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN SELECT 1 AS [fn_securitypredicate_result]
FROM dbo.StaffDuties d INNER JOIN dbo.Employees e
ON (d.EmpId = e.EmpId)
WHERE e.UserSID = SUSER_SID() AND @wing = d.Wing;
GO
CREATE SECURITY POLICY dbo.SecPol
ADD FILTER PREDICATE dbo.fn_securitypredicate(Wing) ON dbo.Patients
WITH (STATE = ON);
GO
Two: App user (e.g., nurse) selects from Patients table
SELECT * FROM Patients
Three: Security Policy transparently rewrites query to apply filter predicate
-- Conceptual rewrite (SEMIJOIN APPLY is not literal T-SQL)
SELECT * FROM Patients
SEMIJOIN APPLY dbo.fn_securitypredicate(patients.Wing);
-- Roughly equivalent to
SELECT Patients.*
FROM Patients,
StaffDuties d INNER JOIN Employees e ON (d.EmpId = e.EmpId)
WHERE e.UserSID = SUSER_SID() AND Patients.wing = d.Wing;
Security
Dynamic data masking walkthrough
1) Security officer defines dynamic data masking policy in T-SQL over sensitive data in Employee table
-- Show only the last four digits of the SSN
ALTER TABLE [Employee] ALTER COLUMN [SocialSecurityNumber]
ADD MASKED WITH (FUNCTION = 'partial(0,"XXX-XX-",4)');
ALTER TABLE [Employee] ALTER COLUMN [Email]
ADD MASKED WITH (FUNCTION = 'email()');
ALTER TABLE [Employee] ALTER COLUMN [Salary]
ADD MASKED WITH (FUNCTION = 'random(1, 20000)');
GRANT UNMASK TO admin1;
2) Application user selects from Employee table
SELECT [Name],
[SocialSecurityNumber],
[Email],
[Salary]
FROM [Employee]
3) Dynamic data masking policy obfuscates the sensitive data in the query results
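One way to see the masking policy in action is to run the same SELECT as a user without the UNMASK permission. This is a small sketch under assumptions: TestUser is a hypothetical database user created only for the check, and the Employee table is the one from the walkthrough.

```sql
-- Hypothetical low-privilege user, created only for this check.
CREATE USER TestUser WITHOUT LOGIN;
GRANT SELECT ON [Employee] TO TestUser;

-- Run the query as TestUser: masked columns come back obfuscated.
EXECUTE AS USER = 'TestUser';
SELECT [Name], [SocialSecurityNumber], [Email], [Salary]
FROM [Employee];
REVERT;

-- After UNMASK is granted, the same query returns the real values.
GRANT UNMASK TO TestUser;
EXECUTE AS USER = 'TestUser';
SELECT [Name], [SocialSecurityNumber], [Email], [Salary]
FROM [Employee];
REVERT;

-- Clean up the test user.
DROP USER TestUser;
```

Masking changes only what is returned to the caller; the stored data is unchanged, so it complements encryption and permissions rather than replacing them.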
Security
Monitoring performance by using the Query Store
Capability
• Query Store helps customers quickly find and fix query performance issues
• Query Store is a ‘flight data recorder’ for database workloads
Benefits
• Greatly simplifies query performance troubleshooting
• Provides performance stability across SQL Server upgrades
• Allows deeper insight into workload performance
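As a rough illustration of the "flight data recorder" idea, the sketch below turns Query Store on and lists the slowest recorded queries. The database name AdventureworksDW is only an example; the catalog views are the standard Query Store views.

```sql
-- Example database name; substitute your own.
ALTER DATABASE [AdventureworksDW] SET QUERY_STORE = ON;
ALTER DATABASE [AdventureworksDW] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO
USE [AdventureworksDW];
GO
-- Top 10 recorded queries by average duration (reported in microseconds).
SELECT TOP (10)
    qt.query_sql_text,
    q.query_id,
    p.plan_id,
    rs.avg_duration,
    rs.count_executions
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
    ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
    ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON p.plan_id = rs.plan_id
ORDER BY rs.avg_duration DESC;
```

Because each plan is recorded with its own runtime statistics, the same views can be used to compare plans for a single query_id before and after an upgrade.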
Performance
Stretch SQL Server into Azure
Stretch warm and cold tables to Azure with remote query processing
[Diagram: an App sends a Query to SQL Server, which holds Customer data, Product data, and Order History; the Order history table is stretched to the cloud in Microsoft Azure.]
Order history
Name        SSN           Date
Jane Doe    cm61ba906fd   2/28/2005
Jim Gray    ox7ff654ae6d  3/18/2005
John Smith  i2y36cg776rg  4/10/2005
Bill Brown  nx290pldo90l  4/27/2005
New Portal
Updated Report Features
New KPI Features
Mobile Reporting
Brand the Portal
SQL Server 2016 Features
So many great features, but which are included with which edition?
https://www.microsoft.com/en-us/server-cloud/products/sql-server-editions/
Deeper Insights Across Data with PolyBase
What is PolyBase
PolyBase
Query relational and non-relational data with T-SQL
• Query relational and non-relational data, on-premises and in Azure
[Diagram: Apps send a T-SQL query to SQL Server, which uses PolyBase to access any data, including Hadoop.]
[Diagram: a single T-SQL query combines SQL Server rows with a quote stored in Hadoop ($658.39).]
Name       DOB       State
Jim Gray   11/13/58  WA
Ann Smith  04/29/76  ME
PolyBase Can…
PolyBase Performance
PolyBase scale-out groups
PolyBase Requirements
• SQL Server (64-bit)
• Java Runtime Environment (see "Java SE downloads")
• TCP/IP enabled (see "Enable or Disable a Server Network Protocol")
Setting up PolyBase
1. Install PolyBase
   a) PolyBase Data Movement Service
   b) PolyBase Engine
2. Configure SQL Server and enable the option
3. Configure Pushdown
4. Create external data source
5. Create external file format
6. Create Hadoop user
7. Create external table
PolyBase Configuration
First, is it installed?
-- Returns 1 if PolyBase is installed on this instance, 0 if it is not
SELECT SERVERPROPERTY('IsPolyBaseInstalled') AS IsPolyBaseInstalled;
-- 5 denotes the connection type
EXEC sp_configure 'hadoop connectivity', 5;
RECONFIGURE;
Option 0: Disable Hadoop connectivity
Option 1: Hortonworks HDP 1.3 on Windows Server
Option 1: Azure blob storage (WASB[S])
Option 2: Hortonworks HDP 1.3 on Linux
Option 3: Cloudera CDH 4.3 on Linux
Option 4: Hortonworks HDP 2.0 on Windows Server
Option 4: Azure blob storage (WASB[S])
Option 5: Hortonworks HDP 2.0 on Linux
Option 6: Cloudera 5.1, 5.2, 5.3, 5.4, and 5.5 on Linux
Option 7: Hortonworks 2.1, 2.2, and 2.3 on Linux
Option 7: Hortonworks 2.1, 2.2, and 2.3 on Windows Server
Option 7: Azure blob storage (WASB[S])
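A quick way to confirm which option is active is to read the setting back. This is a small sketch using the standard sys.configurations catalog view; nothing here is specific to the presenter's environment.

```sql
-- value        = what was configured with sp_configure
-- value_in_use = what the running instance is actually using
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'hadoop connectivity';
```

If value and value_in_use differ after RECONFIGURE, the change has not taken effect yet; for this setting, that typically means SQL Server and the PolyBase services still need a restart.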
PolyBase Configuration
Restart:
• SQL Server
• PolyBase Data Movement Service
• PolyBase Engine
PolyBase Configuration
EXEC sp_polybase_join_group 'PQTH4A-CMP01', 16450, 'MSSQLSERVER';
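After a compute node joins the group, the head node should list it. The following sketch assumes it is run on the head node and uses the sys.dm_exec_compute_nodes DMV; the node names returned depend entirely on your environment.

```sql
-- Run on the head node: one row per node in the PolyBase scale-out group.
-- type distinguishes the HEAD node from COMPUTE nodes.
SELECT compute_node_id, type, name, address
FROM sys.dm_exec_compute_nodes;
```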
PolyBase Configuration
Create Scoped Credential
USE [AdventureworksDW]
GO
-- 2: Create a database scoped credential for Kerberos-secured Hadoop clusters.
-- IDENTITY: the user name
-- SECRET: the password
CREATE DATABASE SCOPED CREDENTIAL HDPUser WITH IDENTITY = 'hue', Secret = '';
GO
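A database scoped credential needs a database master key to protect its secret, so that key has to exist before the CREATE DATABASE SCOPED CREDENTIAL statement runs. A minimal sketch of that earlier step follows; the password is a placeholder to replace with your own.

```sql
USE [AdventureworksDW];
GO
-- 1: Create a database master key if the database does not have one yet.
-- The password below is a placeholder, not a real value.
IF NOT EXISTS (SELECT 1 FROM sys.symmetric_keys
               WHERE name = '##MS_DatabaseMasterKey##')
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password here>';
GO
```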
PolyBase Configuration
Create External Data Source
USE [AdventureworksDW]
GO
CREATE EXTERNAL DATA SOURCE [HDP2] WITH
(TYPE = HADOOP,
LOCATION = N'hdfs://pwpchadoop.cloudapp.net:8020',
CREDENTIAL = HDPUser);
GO
PolyBase Configuration
CREATE EXTERNAL FILE FORMAT TSV
WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS (
FIELD_TERMINATOR = '\t',
DATE_FORMAT = 'MM/dd/yyyy'
)
)
PolyBase Configuration
CREATE EXTERNAL TABLE HDP_FactInternetSales
([ProductKey] [int],
[OrderDateKey] [int],
[DueDateKey] [int],
[ShipDateKey] [int],
[CustomerKey] [int],
…)
WITH
(LOCATION = '/apps/hive/warehouse/factinternetsales',
DATA_SOURCE = HDP2,
FILE_FORMAT = TSV,
REJECT_TYPE = value,
REJECT_VALUE=0)
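Once the external table exists, it can be queried like any other table. The sketch below reads a few rows and creates statistics on a join column, which can help the optimizer decide how to process the external data; the statistics name is a hypothetical one chosen for this example.

```sql
-- Read a few rows from Hadoop through the external table.
SELECT TOP (10) ProductKey, OrderDateKey, CustomerKey
FROM HDP_FactInternetSales;

-- Hypothetical statistics name. Building statistics on an external table
-- reads the external data, so this can take a while on large files.
CREATE STATISTICS Stats_HDP_FactInternetSales_CustomerKey
ON HDP_FactInternetSales (CustomerKey) WITH FULLSCAN;
```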
Using PolyBase
SELECT Insured_Customers.FirstName, Insured_Customers.LastName,
Insured_Customers.YearlyIncome, Insured_Customers.MaritalStatus
INTO Fast_Customers from Insured_Customers INNER JOIN
(SELECT * FROM CarSensor_Data where Speed > 35) AS SensorD
ON Insured_Customers.CustomerKey = SensorD.CustomerKey
Using PolyBase
-- Enable INSERT into external table
EXEC sp_configure 'allow polybase export', 1;
RECONFIGURE;
-- Create an external table.
CREATE EXTERNAL TABLE [dbo].[FastCustomers2009] (
[FirstName] char(25) NOT NULL,
[LastName] char(25) NOT NULL,
[YearlyIncome] float NULL,
[MaritalStatus] char(1) NOT NULL)
WITH
(LOCATION='/old_data/2009/customerdata.tbl',
DATA_SOURCE = HadoopHDP2,
FILE_FORMAT = TextFileFormat,
REJECT_TYPE = VALUE,
REJECT_VALUE = 0);
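With export enabled and the external table in place, rows are written to Hadoop with an ordinary INSERT ... SELECT. This is a sketch that reuses the Insured_Customers and CarSensor_Data tables from the earlier query; the Speed > 35 filter is carried over from that example as an assumption, not as the presenter's export criteria.

```sql
-- Export matching customers to the external table, i.e. to files under
-- /old_data/2009/customerdata.tbl in Hadoop.
INSERT INTO [dbo].[FastCustomers2009]
SELECT DISTINCT c.FirstName, c.LastName, c.YearlyIncome, c.MaritalStatus
FROM Insured_Customers AS c
INNER JOIN CarSensor_Data AS s
    ON c.CustomerKey = s.CustomerKey
WHERE s.Speed > 35;
```

The select list has to match the external table's column order and types, since the external table defines the layout of the exported files.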
Using PolyBase
SELECT DISTINCT Insured_Customers.FirstName, Insured_Customers.LastName,
Insured_Customers.YearlyIncome, CarSensor_Data.Speed
FROM
Insured_Customers, CarSensor_Data
WHERE
Insured_Customers.CustomerKey = CarSensor_Data.CustomerKey and
CarSensor_Data.Speed > 35
ORDER BY CarSensor_Data.Speed DESC
OPTION (FORCE EXTERNALPUSHDOWN);
-- or OPTION (DISABLE EXTERNALPUSHDOWN)
Using PolyBase
SELECT customer.name, customer.zip_code
FROM customer
WHERE customer.account_balance < 200000
PolyBase Troubleshooting
-- Find the longest running query
SELECT execution_id, st.text, dr.total_elapsed_time
FROM sys.dm_exec_distributed_requests dr
CROSS APPLY sys.dm_exec_sql_text(sql_handle) st
ORDER BY total_elapsed_time DESC;
-- Find the longest running step of the distributed query plan
SELECT
execution_id, step_index, operation_type, distribution_type,
location_type, status, total_elapsed_time, command
FROM
sys.dm_exec_distributed_request_steps
WHERE
execution_id = 'QID4547'
ORDER BY total_elapsed_time DESC;
PolyBase Troubleshooting
-- Find the execution progress of SQL step
SELECT execution_id, step_index, distribution_id, status,
total_elapsed_time, row_count, command
FROM sys.dm_exec_distributed_sql_requests
WHERE execution_id = 'QID4547' and step_index = 1;
PolyBase Troubleshooting
-- Find the information about external DMS operations
SELECT execution_id, step_index, dms_step_index,
compute_node_id,
type, input_name, length, total_elapsed_time, status
FROM sys.dm_exec_external_work
WHERE execution_id = 'QID4547' and step_index = 7
ORDER BY total_elapsed_time DESC;
Start Playing
Developer Edition
SQL Server Data Tools For Visual Studio 2015
https://www.microsoft.com/en-us/server-cloud/products/sql-server-editions/sql-server-developer.aspx
https://msdn.microsoft.com/en-us/library/mt204009.aspx
Pragmatic Works Offers a Variety of Services to Help You with SQL Server
- Architectural Design Sessions
- Detailed Assessments and Roadmaps
- Migrations and Upgrades
Ask us about test-driven migrations and upgrades with LegiTest
Sean Werick