Download StretchDB-joeyong

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Concurrency control wikipedia , lookup

Tandem Computers wikipedia , lookup

Oracle Database wikipedia , lookup

Microsoft Access wikipedia , lookup

Team Foundation Server wikipedia , lookup

Ingres (database) wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Functional Database Model wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Database wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Clusterpoint wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Relational model wikipedia , lookup

Database model wikipedia , lookup

SQL wikipedia , lookup

PL/SQL wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

Transcript
SQL SERVER
STRETCH
DATABASE TOP
TO BOTTOM
JOE YONG
SR. PROGRAM MANAGER
M I C R O S O F T C O R P.
LIGHTNING POLL
 When you think about your organization’s enterprise
storage or SAN experience are you
1
5
10
 What is your data center footprint and DBA team size
compared to database size and complexity growth?
 Thinking of your largest OLTP databases, how many rows in
the largest tables? What percentage is cold data?
 If you have databases with very large tables
 What is your maintenance (e.g. re-index, performance tuning)
experience?
 Is there impact to their availability and performance SLAs?
ACTUAL CUSTOMER RESPONSES
 Is there a 0 or negative rating?
 I want to “talk” to the person who came up with that “do more with less” thing
 45 billion rows in a single table (99% cold)
 1.3 trillion rows in a partitioned table (>80% cold)
 8TB of data in a single table
 Re-index? That’s not funny man
 We can’t even update statistics beyond 10% sample
 We are down for about one day if we need to do a full DB restore though
about 70-80% of the data is cold
 ~11TB database, no backup, no DBCC CHECKDB in 1.5 years
WHY STRETCH DATABASE?
Ever growing databases
Existing solutions
 Massive tables (hundreds of
millions/billions of rows, TBs size)
 Keep adding storage and
datacenter capacity
 Users want to keep all their data and
use it too
 Closed business (cold) data accessed
infrequently but must be online
 Does not solve maintenance
and SLA issues (unless you put
cold data in SSDs)
 Maintenance challenges (e.g. re-index)
 Move data to another location –
database, flat-files, backups, tapes
 Business SLAs at risk (e.g. restore time)
 Data is near-line or offline
 Delete older data
 Unknown if data is valuable
 Unacceptable or illegal in some
organizations/industries
STRETCH SQL SERVER INTO AZURE
Stretch Database
Microsoft Azure
Mike Wan
Order history
Name
ox7ff654ae6d
3/18/2005
Stretch to cloud
SSN
Steve Karimi
Date
Customer data
cm61ba906fd
2/28/2005
Mike Wan
ox7ff654ae6d
3/18/2005
Product data
Sapna Matuszyk
SQL Mojoe
4/10/2005
i2y36cg776rg
Order History
nx290pldo90l
4/27/2005
Blazej Zwilling
ypo85ba616rj
5/12/2005
Cheng Jeswani
bns51ra806fd
5/22/2005
Nikhat Lindell
mci12hh906fj
6/07/2005
Whitney Lang
utb76b916gi
6/18/2014
Lorenzo Olds
px61hi9306fj
7/1/2014
Sophie Cook
ol43bi506gd
7/12/2014
Aida Durham
tx83hal916fi
7/29/2014
Jaclyn Wade
nb95re926gi
8/10/2014
Adrienne Battle
vc61ira536fe
8/23/2014

Query
App
Stretch cold tables securely from
SQL Server databases to Azure
with remote query processing
Benefits
 Cost effective online cold data
 Entire table is online and
remains accessible from existing
apps
 No application changes
 Addresses hard problems
arising from very large tables
 Secure by default with optional
additional security with Always
Encrypted and Row Level
Security
SECURE BY DEFAULT (AND THEN SOME)
 Data in motion always via
secure channels (TLS1.2 / 1.1)
 Supports Always Encrypted
 Encryption key remains onpremises
 Row level security
 SQL Server and SQL Azure
audit
 Integrated authentication
(pass-through) with AAD
integration
KEY TAKEAWAYS
 Secure communication channel is mandatory
 Additional SQL Server data encryption technologies
work transparently
 Adheres to SQL Sever security hierarchy
 SQL Server and Windows integrated authentication
 No change to authorization, permissions model
 Audit SQL Server and/or Azure SQL Stretch database
TRICKLE DATA MOVEMENT
Azure SQL Stretch
Database
On-premises
application(s)
On-premises
SQL Server
instance
Trickle data migration
DB in SAN/Local
Storage
SQL Server
Ord_Detail
Txn_detail
Ord_detail_
archive
table
Hot + cold
rows in
same table
Compute
Storage
Ord_detail_
archive
table
Txn_detail
(cold rows
only)
Entire archive Only cold rows
moved
table moved
KEY TAKEAWAYS
 Transactionally consistent data movement
 Moved in batches and tracked with batch IDs
 Fixed batch sizes in SQL Server 2016 release
 Optimized for low impact to existing production
workload
 DBAs have some control by scheduling pause/resume
via SQL Agent
SMART QUERY PROCESSING
Azure SQL Stretch
Database
Local +
Local data
remote data
only queries
queries
On-premises
application(s)
On-premises
SQL Server
instance
Trickle data migration
Transparent remote
data access
DB in SAN/Local
Storage
SQL Server
Compute
Ord_Detail
Txn_detail
Ord_detail_
archive
table
Hot + cold
rows in
same table
Storage
Ord_detail_
archive
table
Txn_detail
(cold rows
only)
KEY TAKEAWAYS
 All cases
No modifications to existing queries
Common optimization techniques still apply
Azure SQL Stretch DB service will “get smarter” without user
action (tune, upgrade, etc…)
Users can increase/decrease performance level of Azure SQL
Stretch DB as needed
 Stretch entire table
Queries always pushed to local and remote
 Stretch cold rows only (predicate based stretch)
Some smarts about query execution – predicate injection to
query tree during algebrazation so optimizer can prune
remote branch if appropriate
HIGH AVAILABILITY AND DISASTER
RECOVERY
 Supports existing SQL Server
HA/DR solutions, except
Backup/restore
 DB mirroring – deprecated for SQL
Server
 Replication – not supported for
StretchDB in general
 DBAs backup/restore local SQL
Server hot data only
 StretchDB ensures remote data
is transactionally-consistent with
local
 Upon completion of local
restore, SQL Server reconciles
with remote using metadata
BACKUP & RESTORE
Azure SQL Stretch
Database
SQL Server
On-premises
SQL Server
instance
On-premises
application(s)
Compute
Storage
“Shallow”
backup
Regular backup
& restore
restore
&
Table 1
Table 1’
Table 2
Table 2
Table 2
DB in SAN/Local
Storage
Ord_detail
Txn_detail
Ord_detail_arc
hive
table
Txn_detail
(cold rows
only)
KEY TAKEAWAYS
 Existing HA/DR technologies and practices still apply
 Always On Availability Group with readable secondary works
transparently
 Backup/restore experience improves after most of the cold data
has been migrated to Azure
 Users backup/restore SQL Server database and logs
 Upon completion of restore, run sp_reauthorize_db – triggers
reconcile
 Full data copy operation if restore with copy of Azure SQL
Stretch database
 SQL Server database restore time and reconcile with metadata if
restore in-place for Azure SQL Stretch database (e.g. 20TB
database, 80% cold data, restore time is restore 4TB SQL Server
and reconcile with Azure SQL Stretch database)
CURRENT LIMITATIONS AND NON-USE
CASES
 Limitations
 Certain data types, replication, computed columns, etc…
 Table with primary key in a PK/FK relationship – cannot stretch parent
tables, child tables ok
 Cross database constraints not supported
 Detailed list at https://msdn.microsoft.com/enus/library/mt605114.aspx
 Non-use cases
 Not for frequently accessed data (hot)
 Can connect directly to Azure SQL Stretch database but not
recommended
 Does not alleviate performance/scale issues for application that have
inefficient design/queries
 Can amplify existing performance issues
COMMON CONCERNS
 We do not trust public clouds
 Encrypt sensitive data with Always Encrypted, keep encryption key on-premises;
Microsoft never has access
 Azure is designed to meet/exceed security and compliance requirements with regular
tests and audits
 Price is higher than expected
 Current entry level optimized for dense/premium compute
 Lower entry level work in progress
 No customer should be blocked by pricing, contact [email protected]
for help
 Licensing and capacity planning is really complicated
 StretchDB enabled on all editions
 Pay based on consumption; storage and compute charged separately
 What is with the PK/FK, unique constraint, XML type and other limitations?
 Only stretch child table in parent/child relationship today
 Assumes uniqueness already validated before data turns cold
 Most limitations are short term and will be removed gradually in future
iterations based on customer feedback
STRETCH DATABASE TAKEAWAYS
 Works without code change for most applications
 Can save customers a lot more than money – business
SLA, performance, administration, design simplicity
 Secure by default and supports advanced security
requirements
 Existing HA/DR solutions works as-is, may have
significant restore time reduction for some scenarios
 Performance will be at reasonable levels for cold data
scenarios; not intended for active OLTP data
 SQL Server 2016 targets customers with large amounts
(TBs) of cold data
WE WANT TO HEAR FROM YOU
 Customer engagements, technical/business questions,
project issues
[email protected]
 Product/feature feedback, bugs, feature requests
https://connect.microsoft.com/SQLServer/Feedback
 Twitter
#stretchdb
A BIG THANKS TO OUR SPONSORS