IBM DB2 Top Ten Technologies Transcript Sid Misra and Irshad Raihan, Presenters Slide 1 Welcome to the IBM® DB2® Webcast Series for Oracle Professionals. Running throughout 2011, this series is designed to help you learn about DB2 in a way that is fast and fits easily with your schedule. This Webcast is the first in the series and is an overview of the ten technologies that you should consider in DB2. You will be hearing more about these features, in more depth, in the upcoming Webcast sessions. I am Sid Misra of the DB2 product marketing team and I have Irshad Raihan, also on the DB2 marketing team, online with me today. Irshad and I will be presenting this Webcast to you. So thanks for your attention and let's get started. Slide 2 So the first technology which I'll be talking about today is DB2 performance. Slide 3 So, IBM DB2 was specifically designed to operate with great efficiency and as a result, has a long and strong track record of leadership in database benchmarks. Now this chart represents the number of days of performance leadership for three important industry performance benchmarks, over a period of more than seven years. And as you can see on the chart, DB2 has been the leader for over twice as long as the Oracle database for the TPC-C benchmark -- which is for transactional workloads. Now for the SAP 3-Tier SD benchmark, DB2 again has a dominant position and leads Oracle by over ten times in terms of days of leadership. As you can see on the chart, DB2 has similar leadership in the SPECj benchmark as well. So you might wonder: why are days of leadership important in benchmarks? Well, industry benchmarks are essentially like a game of leapfrog. Vendors like Oracle, IBM, and Microsoft are continuously optimizing to outperform one another, and the company in the lead position varies from time to time. So a more telling statistic for benchmarks is days of leadership -- which vendor has been at the top the longest.
And as you can see, DB2 has significantly dominated the industry over time in performance benchmarks. Slide 4 Now on Chart 4, let's dig a bit deeper into the TPC-C performance. Of all the performance benchmarks, TPC-C is the most prominent and the most important transaction processing benchmark in the industry. Here is some data from the TPC-C benchmark. Compare the amount of throughput per core for Oracle/Sun's best TPC-C results and HP Itanium 2's best results with the recent generations of IBM Power Systems™. Now on the chart, you can see the POWER5, POWER6, and POWER7 chip sets. IBM provides far better throughput per core than Oracle/Sun and HP Itanium. So why is this important? This is critical because software is licensed by CPU core. If you need fewer CPU cores for your software, that reduces your initial acquisition cost, and it can also reduce your ongoing software support and maintenance cost. Having fewer CPUs also reduces the complexity of your IT environment and helps you manage the future growth of your IT environment more efficiently. Slide 5 So the second technology which I want to cover today is DB2 pureScale®. IBM introduced DB2 pureScale to address the increasing business need for scale-out efficiency. This technology is actually based on System z®, which has been a gold standard for high availability and scalability for many years. DB2 pureScale provides you unlimited capacity, continuous availability, and application transparency to reduce risk and cost as your business grows. Slide 6 On the next slide, this is actually something from the public record. It shows the results of some tests that were conducted by Dell, analyzing the scalability of Oracle RAC. Now on the graph, the X-axis represents the actual nodes in the cluster and the Y-axis shows the effective nodes. The green line on the graph shows ideal, near-linear scaling, while the red line shows the effective performance.
So as you can see on the graph, when adding nodes to your Oracle RAC cluster, the performance doesn't climb linearly. For example, at four nodes the system performs as if the cluster contains fewer than two nodes. Similarly, at eight nodes the system performs as if the cluster contains fewer than three nodes. So with Oracle RAC, you're spending all this money to build your system -- but by adding nodes, you're not getting the same value in terms of increased scalability and performance. Slide 7 The next chart compares the scalability of Oracle RAC to DB2 pureScale. Now as you can see, DB2 pureScale has near-linear scale-out efficiency. DB2 pureScale came to the market a couple of years ago and it's similar to Oracle RAC -- it uses a similar architecture and a similar high-level approach. However, once you get below that highest-level description, it takes a very different approach. Oracle RAC distributes the management across the nodes, whereas DB2 pureScale uses centralized management. And this, together with DB2 pureScale's global caching, really has made a huge difference in both the performance and the reliability of DB2 pureScale. Now for reference, I showed you in the previous chart that Oracle RAC had an effective throughput of 1.69 nodes for a four-node system. As we can see on this chart, DB2 pureScale has an effective throughput of 3.9 nodes for a four-node system. So that's correct -- DB2 pureScale gives you more than double the throughput. And this difference gets even bigger when you add more nodes. At eight nodes, Oracle gives you 2.44 effective nodes, whereas DB2 pureScale gives you 7.54. So that is three times more throughput. Oracle RAC's inefficient scaling is a waste of your precious dollars not only in hardware but also in software because, as I mentioned, software running on the boxes is priced by processor core. So essentially you are paying more for your software even though your hardware is not scaling to the level you desire.
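To make the comparison concrete, scale-out efficiency is just effective nodes divided by actual nodes. A quick sketch of that arithmetic, using the figures quoted above (the query shape and column names are illustrative only, not part of any benchmark):

```sql
-- Scale-out efficiency = effective nodes / actual nodes,
-- computed from the figures quoted in the talk (illustrative only).
SELECT platform, actual_nodes, effective_nodes,
       DECIMAL(effective_nodes / actual_nodes * 100, 5, 1) AS efficiency_pct
FROM (VALUES ('Oracle RAC',    4.0, 1.69),
             ('DB2 pureScale', 4.0, 3.90),
             ('Oracle RAC',    8.0, 2.44),
             ('DB2 pureScale', 8.0, 7.54))
     AS t(platform, actual_nodes, effective_nodes);
```

At four nodes that works out to roughly 42% efficiency for Oracle RAC versus about 97% for DB2 pureScale, and the gap widens at eight nodes.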
Slide 8 So the third technology which I'll be covering today is deep compression. Slide 9 Storage costs continue to be a concern for customers. Data has been growing exponentially and with that, storage costs have also been growing exponentially for customers. DB2 9.7 has introduced deep compression technology which can help deliver storage savings. DB2 supports index, temporary table, as well as XML compression. For index compression, depending upon the type of index and the data distribution within that index, DB2 will automatically choose from a set of compression algorithms the one that will provide optimal compression for that index. Similarly, DB2 has automatic compression for temporary tables as well. And this can yield huge savings, especially in warehouse applications, where large sorts and large intermediate data sets can consume a significant amount of storage space in temporary tables. DB2 also applies intelligent compression techniques to XML to further reduce your storage costs. DB2 9.7 compresses XML both when it is inlined as well as when it is contained in its XDA object. Now a lot of our customers have experienced tremendous savings with compression. With data and index compression alone, we've seen customers saving around 68% to 70% in storage. And we have also seen a significant improvement in performance as well. If you add XML compression to this, the savings are even higher -- up to 75%. And remember, this includes only the online database storage needs. When you consider backup and recovery databases, the savings with compression are even higher than that. Slide 10 So the next chart, Chart 10, is from a study conducted by Triton Consulting. What they did was analyze the complexity of performing some routine tasks on both Oracle 11g and DB2 9.7. For the purpose of the study, they got Oracle database DBAs with over 10 years of experience -- quite experienced DBAs.
And they got the equivalent DBAs on the DB2 side. Then they created a methodology for assessing the complexity of several routine DBA tasks. Now this chart shows the results of the complexity analysis for data and index compression. The first graph shows that data compression is 50% less complex with DB2 9.7 relative to Oracle 11g. In other words, it takes 50% less time to configure data compression with DB2 9.7. And with regard to index compression, DB2 is 100% less complex than the Oracle database. The reason here is that with DB2, index compression and data compression are part of the same task flow. So essentially, you need no separate steps for configuring index compression with DB2. So you can see that deep compression is a very powerful technology in DB2 9.7 and it's also very simple to implement. Slide 11 So the fourth technology which I'll be covering today is autonomics and administration. Slide 12 In recent years, DB2 has focused on making it easier for you to administer your database system by automating some routine tasks. You not only get optimized performance but, more importantly, it frees up DBA time to work on higher-value tasks. With DB2 9.7, DB2 automates database maintenance, so tasks like runstats, database restore, and backup utilities can all be automated. And you have a user-friendly wizard that helps walk you through the process. DB2 also heals itself. DB2 has a facility called the Health Center, and the Health Center allows you to set up thresholds for various database warnings and alarms. You can configure DB2 to use Health Center recommendations to respond to these warning and alarm situations. DB2 can also tune itself, and this is a very powerful feature. DB2 has a self-tuning memory manager, STMM, that simplifies the task of memory configuration by automatically setting values for several memory configuration parameters. So STMM is very easy to configure.
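As a hedged sketch of what that configuration looks like, STMM can be switched on with a couple of database configuration updates, shown here through the SYSPROC.ADMIN_CMD stored procedure (the exact parameters you leave automatic will vary by environment):

```sql
-- Minimal sketch of enabling the Self-Tuning Memory Manager (STMM).
-- Assumes you are connected to the target database with sufficient authority.
CALL SYSPROC.ADMIN_CMD('UPDATE DB CFG USING SELF_TUNING_MEM ON');

-- Optionally let DB2 size overall database memory automatically as well:
CALL SYSPROC.ADMIN_CMD('UPDATE DB CFG USING DATABASE_MEMORY AUTOMATIC');
```

With those set, memory consumers such as buffer pools, the sort heap, and the lock list can be left at AUTOMATIC, and DB2 redistributes memory among them as the workload shifts.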
It is actually a two-step process if all the default values are used. In comparison, undertaking memory tuning in an Oracle database environment is complex: it involves multiple operating system checks and several memory parameters that sometimes require a full database restart. So whether you're a junior or a senior DBA, STMM can save you hours, or even days, of tuning time. It is a very powerful feature. Slide 13 So the next chart here, Chart 13, is again from the Triton study which I talked about earlier. This shows the results of the complexity analysis of six routine DBA tasks: installation; data compression and index compression, which we talked about earlier; backup and recovery; automatic memory management; and authorization. Based on the results of the study, DB2 has a clear and overwhelming advantage on all six routine DBA tasks that were evaluated. As an example, let's talk about automatic memory management, which we covered on the previous slide. Configuring automatic memory management on DB2 is 90% lower in complexity than in the Oracle database. The report shows that automatic memory management tasks in the Oracle database, for a specified environment, could take 100 minutes of DBA interaction time to complete. In contrast, the same automatic memory management tasks on DB2 would take just 10 minutes. So that's a huge saving of time for a DBA. DB2's simplicity relative to the Oracle database translates into less DBA time spent on these routine tasks, less time training your new staff, and a lower risk of errors that can impact the quality of service. So again, the autonomics and administration features have definitely helped DBAs a lot by making their lives so much easier, saving them time, and letting them focus on higher-value tasks. Slide 14 So the next technology which I want to talk about is the SQL compatibility technology.
Slide 15 So the SQL compatibility technology in DB2 9.7 has caused a paradigm shift in the world of database migration -- by allowing customers to migrate to DB2 from Oracle or Sybase databases in a matter of days or weeks, rather than months. The DB2 migration technology is a game-changer -- not only because it reduces the time and the cost of the actual migration, but also because it reduces training and development costs. Since DB2 has native support for Oracle PL/SQL syntax, applications built to run on Oracle databases require few or no changes to run on DB2. What this also means is that DBAs can continue to use their PL/SQL skills even after migration. This chart also shows a pretty interesting and telling quote from a notable Oracle expert who has worked with Oracle for over 15 years. He calls the compatibility between Oracle and DB2 "freaky" … at a level which he has not seen in the enterprise database world. So this is a strong statement on the SQL compatibility technology. Slide 16 Now as I mentioned earlier, DB2 supports PL/SQL natively. With DB2 9.7, everything from PL/SQL applications … to the Oracle SQL dialect … to the concurrency model … to data types … built-in functions … built-in packages … SQL*Plus scripts … Oracle JDBC metrics … and online schema changes -- all are supported natively by DB2. This chart also has an interesting graph on the right-hand side, showing the results of the DB2 early access program. Before we went to market with this capability, we had a very strong beta testing period -- and it was a very long period as well. It lasted for a year and we had several hundred companies participate in this beta program, ranging across different industries, with different solutions and application sizes, from different parts of the world. With all these companies, we worked very closely to do a deep analysis of the code.
And you can see the results of those analyses on the chart. What we really wanted to measure here was how much of their Oracle PL/SQL code worked on DB2 out-of-the-box … and how much of it needed tweaking. What we found was that the lowest amount of code supported out-of-the-box was 90% -- and it ranged up to 100%. We took an average across the 750K lines of code that we analyzed and found that 98% of it was supported out-of-the-box. So that's huge, and we've got some great endorsements from a number of customers who worked in the beta program and have actually migrated to DB2. One such quote is on the chart, where the customer calls the compatibility "amazing". So at this point I'll turn it over to Irshad to take you through the remaining five powerful and exciting technologies in DB2 9.7. Irshad, over to you. Thanks Sid. Hi everyone, this is Irshad Raihan; I work for the DB2 product marketing group within IBM's information management business. And as Sid pointed out, I will be talking about the remaining five technologies. This is part of a developerWorks® article, soon to be published, that will cover the top 10 DB2 technologies, in a slightly different order. The way we ordered the technologies today, we covered first the five technologies that are getting a lot of press coverage as well as attention from our customers. We are hearing a lot of great things about all the technologies that Sid just talked about, from performance, scalability, compression, and autonomics, rounded out nicely by SQL compatibility, which is causing quite a stir in the market. DBAs and other practitioners are looking to this technology as a savior … because they have been trapped with their current vendor with no way out … because migration, in the past, has carried huge cost and risk, and it hasn't always justified the move.
Whereas now with SQL compatibility, which is available for both Oracle and Sybase customers, the game has changed quite a bit. So, the next five technologies that I will be talking about are, as I said, probably not getting as much publicity, but they are equally important, and I will start with HADR, High Availability Disaster Recovery. Slide 17 Let me show you a picture quickly; this is a picture of the greenest data center in North America, and it's also very highly available. Slide 18 Now with high availability and disaster recovery, there are really two thoughts here. There is the thought around making your data more available -- making sure that when there is a transaction in the middle tier of your three-tier architecture, it is properly committed, and some sort of confirmation goes back to the middle tier that the transaction was committed. And, you know, things go wrong: there are natural disasters and other events that can bring down your system. Therefore, you want to make sure that there is transactional integrity across your three tiers. At other times, there are planned outages. This is a particularly important point for a lot of DBAs, because there are a lot of things that you can do in DB2 while the system is still up that you cannot do with other databases. One of those is real-time schema changes -- things like stored procedures that allow online movement of tables, for instance … you can move tables online to a different table space … you can freely change columns used in views and other objects. You can even change certain data types within a column, and that's almost unheard of. For those of you who have tried this before, it usually means bringing down the database, if only for a few minutes.
But it's something that can be done on the fly in DB2: for instance, if you were to change between compatible types such as integer to varchar, or character to decimal, those changes are allowed in DB2 while your database is still running. There are also unplanned outages -- stuff happens, and you want to make sure that all your in-flight transactional data is captured, and that anything that needs to be rolled back on your database happens correctly as well. So, one of the features I want to talk about here is something known as autonomous transactions. What that really is -- think of a nested transaction … think of a trigger on a table that is part of a transaction … that in turn calls another transaction within it. That nested transaction is an autonomous transaction. Let's take an example: it stores the authorization ID of the person who is accessing that table into an audit field. So if your outer transaction fails and is rolled back, the fact that the inner transaction completed is not lost -- because you want to be able to record the fact that so-and-so's authorization ID viewed this table, and this amount of data, from this time to this time, right? You don't want that fact to be lost. So things like autonomous transactions are supported in DB2. The other aspect that has been added relatively recently, and I think this is another game-changer, is read on standby. Typically, the way you would configure your HADR environment is you would have a primary server that essentially processes all the requests coming in, and then you would have a standby server that essentially just hangs around. It's not processing any work; it's just waiting for the primary server to fail, right? And it takes over as soon as that happens, within a few seconds, including things like in-flight transactions, which is great.
But in an environment where you are looking to cut cost, increase utilization of all your resources, and do more with your infrastructure, you don't really want servers hanging out there twiddling their thumbs. Instead, what you are able to do now with read on standby, which is a relatively new feature in DB2, is run read-only workloads on the standby server. What this means is that you can run things like business intelligence workloads or reports -- things that can very quickly be sent to the back burner in case the standby server needs to take over from the primary server during some kind of unplanned outage. You quickly put those on the back burner, make sure that your actual real-time workload is being processed by the standby server, and then get back to the read-on-standby work when the primary is back up. Slide 19 The other great thing about the way HADR works -- I will talk about it on this next slide -- is the way it is set up: you have two HADR servers at the top, in the yellow shapes there, and each of those has its own set of logs. You can set up, in this case, the one on the left as the primary server and the one on the right as the standby server. If the primary goes down, as I mentioned, the standby takes over; and whenever the primary gets back up, it essentially becomes the standby, because what used to be the standby has been storing in-flight transactions and has all the updated logs. So instead of handing control back to the old primary, the standby carries on as the primary. Another aspect of HADR that I want to mention here, which I think is a differentiator compared with the way the competition does it, is synchronous versus near-synchronous versus asynchronous replication. Essentially, these are three different modes in which your transactions are committed to the database.
There is a little bit of a trade-off here, and I will talk about that as soon as I explain the modes. So, let's talk about a commit request that is sent in. In the first case, the asynchronous case, as soon as the primary HADR server receives the request, it sends the log data over to the standby server, and as soon as it has sent that information over the TCP/IP connection, it sends back a confirmation to the user that the commit has succeeded. In the near-synchronous case, when a commit request comes in, the primary server -- just as in the first case -- sends the data that has to be stored in the log to the standby HADR server. And only once that data has been received in the memory of the standby HADR server is a confirmation sent to the user. So as you can see, there is a little bit more of a guarantee here, right? Because in the first case, your TCP/IP connection may be faulty, and that send/receive handshake may not take place properly … so your standby server may need another try to actually replicate the information that exists on the primary server. So you have a little bit of a gap there. In the near-synchronous case, there is a little bit more of a guarantee. And in the third case, the synchronous case, only when the data is actually written to the log on the standby server is a confirmation sent back to the user. So, again, there is a trade-off here. It will depend on your cost characteristics as well as your SLAs, the way they are defined in terms of high availability as well as performance. So it is something for you to think about … but I wanted to mention that you have a lot of ways in which you can configure HADR, which are superior to the competition. And there is also the opportunity to use a combination of HADR for high availability and pureScale for scaling out your application, as Sid pointed out.
I think it will give you a competitive advantage. And I don't need to tell you about the importance of HADR -- I would be preaching to the choir -- because you know better than I do that downtime is expensive, and not just in the amount of business that is lost. You know, there are multiple examples of Web sites that have been down for a few hours; it is not just the lost business itself, which is of course significant, but more importantly, it's the loss in brand equity and the erosion of faith that your customers have in your systems. That can be a much bigger factor, so this is something that is important and needs to be given a lot of time and thought. Slide 20 Okay. So let's talk about the next technology here, which is data security and privacy. Again, this is a very important topic and there are lots of different aspects to it. More often than not, when the finger pointing starts, everyone points at the DBA and the database practitioners regarding data security -- when actually, there are hundreds of flaws in the system. In just the way that technology is built, you have multiple attack points, especially with the number of mobile devices out there today. There are hacks that happen at the Web site level all the time. We will talk about something called SQL injection in a little bit, and how it's causing a lot of headaches for IT administrators as well as database administrators. Slide 21 So, when you think about security, there are many questions that can be answered by DB2. There is a lot of instrumentation that has gone into the product that makes it very easy for you to not only secure the boundaries of your database in real time, but also perform almost a forensic analysis, because there are fingerprints that hackers leave that will enable you to track them down.
So starting with who is accessing your data: DB2 has the ability to track all your connections and authorizations, as well as what was changed. So this is the actual statement text that was changed, down to the DDL that was actually processed, in addition to where the request came from -- there are application IDs and, of course, the TCP/IP address of the originating request, which many times can help you track down the perpetrator. Then there is also the ability to track when a certain event happened. I gave you the example of an autonomous transaction that tracks down to the point of who accessed what data, and when … and that can give you a very good idea of where the problem might be. You also have the option of reporting how a certain database action was processed -- whether a certain person had the necessary rights to process that transaction. Which brings us to the why. With label-based access control (LBAC) technology -- it's been around a while, but all sorts of new enhancements have gone into this release of DB2, and into the next release as well -- essentially what it lets you do is let a security administrator create security policies that are based on labels. It is very easy to define the label, then define the roles, and then just map the two -- instead of having to grant individual users access. The other thing that is important here is compliance. A security policy really describes the criteria used to decide who has access to what data. One advantage of doing this is that you are able to protect sensitive data and separate duties. So your DBAs can access your table schemas -- they can look at the topology of your database, partitions and all of that -- but they are not able to look at your data. There are a lot of compliance rules that dictate that, and DB2 is able to do that for you automatically.
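As a hedged sketch of the label-based approach just described, a security administrator might define a component, a policy, and a label, and then attach them to a table; all of the names here (acct_policy, accounts, salary, alice) are invented for illustration:

```sql
-- Hypothetical LBAC sketch; object names are illustrative only.
-- These statements are issued by a user holding SECADM authority.
CREATE SECURITY LABEL COMPONENT level
    ARRAY ['CONFIDENTIAL', 'GENERAL'];          -- highest value first

CREATE SECURITY POLICY acct_policy
    COMPONENTS level WITH DB2LBACRULES;

CREATE SECURITY LABEL acct_policy.confidential
    COMPONENT level 'CONFIDENTIAL';

-- Attach the policy to a table and protect a sensitive column:
ALTER TABLE accounts ADD SECURITY POLICY acct_policy;
ALTER TABLE accounts ALTER COLUMN salary SECURED WITH confidential;

-- Grant the label to the users (or roles) who may read that column:
GRANT SECURITY LABEL acct_policy.confidential
    TO USER alice FOR READ ACCESS;
```

The point of the pattern is exactly what the talk describes: access is granted by mapping labels to roles or users, not by granting each individual user access to each piece of data.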
Slide 22 On the next chart, Chart 22, I want to talk a little bit about the actual tools that can help you do that. I talked about LBAC, which is a feature inside of DB2, but there is also a plethora of tools that IBM offers that can help you through the life cycle of your information. And that's the key phrase there -- it really is the life cycle. We don't think security is something that happens only after deployment. It really starts all the way from development, goes through your deployment stage, and it's an ongoing process. So if you are looking for tools to define policies, as well as metrics to understand how secure your data is, you might want to read up on InfoSphere® Guardium®, or Data Architect, or Discovery. There are multiple offerings in each of those areas. There are workshops. IBM is also running a series of information governance events that you might be interested in signing up for. These are free events where you can learn much more about how each of these tools fits into the bigger picture and how they can help you set up much more robust information governance across your organization. You also have tools that can protect data across the enterprise from unauthorized use, such as Data Redaction as well as Data Privacy and Encryption Expert. And finally, you have tooling that can assess vulnerabilities and validate compliance automatically. This is important because a lot of companies are required to run periodic internal audits in addition to external audits. There are industries, such as finance and healthcare, that have all sorts of regulatory rules around the way that data needs to be accessed … there are privacy issues and, as you can imagine with the explosion of data, these are only going to become more pronounced.
Therefore, if you have tools that can help you perform those audits and comply automatically, that not only reduces the time and effort required to perform the audit, it actually makes your database environment a lot safer. Slide 23 Okay, so I want to move on to the next topic here, which is one of my favorites -- XML. IBM and DB2 have endorsed XML for a long time now -- even well before there were industry standards. Slide 24 Today, every industry has its own set of guidelines around XML … not just guidelines, but also standards, such as ACORD for the insurance industry, that dictate the way that XML needs to be processed within a database. DB2 has had the ability to process XML for a while, but what we did about four or five years ago was announce pureXML®, which was essentially a breakthrough technology, because no one even today does it the way that DB2 does it. In the picture at the bottom left of slide 24, you have your XML tree on the left and your relational database (RDB) tables on the right. Those two are married together inside of DB2. In the picture on the right, you will see there is a single query that accesses a table inside of your relational storage and, in the same query, information that's being queried out of your XML tree. This is significant because, as the picture here shows, DB2 swallows XML whole. I think it was InformationWeek that carried this article, which made quite a stir -- not just because of the scary picture. And by the way, the reason there was a snake on the cover is that DB2's code names were all snake-based, and that particular release was Viper. It's probably not biologically accurate, because vipers aren't constrictors, but you get the point: DB2 swallows XML whole. DB2 has the ability to store and query XML natively.
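A single query that touches both relational columns and the XML tree, as described above, might look something like this sketch (the orders table, its details XML column, and the XPath expressions are all invented for illustration):

```sql
-- Hypothetical SQL/XML sketch: one query mixing relational and XML data.
-- Assumes a table ORDERS(id INTEGER, details XML).
SELECT o.id,                                            -- relational column
       XMLQUERY('$d/order/customer/name'
                PASSING o.details AS "d") AS customer   -- value from the XML tree
FROM   orders o
WHERE  XMLEXISTS('$d/order[total > 1000]'
                 PASSING o.details AS "d");
```

The relational predicate machinery and the XQuery navigation run inside the same statement, which is the "married together" behavior the slide illustrates.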
This is very significant for insurance companies and for government agencies -- think about tax forms and the tax season: every year there are small changes to the tax code, and so tax forms change constantly. When you have to represent that information as relational columns instead of as XML, it's a huge pain. And it's an even bigger pain if you have an industry standard that dictates that you have to supply certain forms as XML, because there are requirements around online availability. There are all sorts of government forms, for instance, that have to be available online, and the easiest way to do that is to store them as XML. Now if you don't have the capability to store XML as XML in your database, what you have to do is something known as shredding. Essentially, what happens with shredding is that each node of the XML DOM tree -- the one you see in the figure at the bottom left -- is taken and shredded into multiple relational tables. It's represented through multiple relational tables because the one thing that relational tables lack is the ability to show these relationships … they do show relationships, but in a different way -- not the relationships between the nodes. It's not easy to show hierarchy -- those are the kinds of relationships I was referring to. It's not easy to show that so-and-so nodes belong to so-and-so parents and have so-and-so children, and so on. To capture all of that information in relational tables takes a lot of time and effort. Therefore, companies that were, and still are, storing XML in relational tables have to incur all this overhead every time they take XML and parse it into relational tables. And when they make a change, essentially the opposite has to be done.
A translation has to be done: the XML DOM tree has to be reconstructed on the fly from each of those tables, and there is a lot of overhead going on there. And finally, as the data in those relational tables is stored as large objects -- LOBs or BLOBs -- that again just increases your storage. The way that DB2 does XML is we store it as XML. That's number one -- it's not being stored as a LOB, it's being stored as XML, and it's very easy to insert as well as to make any changes, because it's just XML -- it's just so much easier, so much faster -- there is no overhead there. The other thing I want to say about the DB2 approach to XML is the use of the Simple API for XML (SAX), and it is a parse-once technology. This is again very important: like I said, you parse the XML once, you store it in a DOM-like tree structure, and now you have the performance and flexibility to get to the data and modify it quickly. If you are a DBA and you have tried to add a column to your database -- it sounds simple enough -- but the amount of work that goes into adding a column … running all the unit tests … running all the system tests … making sure that you haven't broken anything else in the system … and then finally making the changes to your applications takes a lot of work. With XML, it is as simple as adding a node to the tree. So there are tremendous advantages to storing your data as XML, and if your industry requires data to be stored as XML, there is even more advantage. Slide 25 One other thing I want to talk about around XML -- and this is a figure that I borrowed from a book that is out there, a book on DB2 9's features written by a group of authors from our Toronto lab -- this particular figure talks about how XML works with scalability.
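The "add a node instead of a column" point can be shown in a few lines of Python; the order document and the gift-wrap field are made up for illustration, and `xml.etree` stands in for native XML storage:

```python
import xml.etree.ElementTree as ET

# An order document already stored natively as XML (illustrative shape).
order = ET.fromstring("<order><item>widget</item><qty>3</qty></order>")

# Schema change: the business now wants a gift-wrap flag on every order.
# With shredded storage this would mean ALTER TABLE plus regression tests
# across every dependent application; with native XML it is one new node.
ET.SubElement(order, "giftwrap").text = "yes"

print(ET.tostring(order, encoding="unicode"))
# <order><item>widget</item><qty>3</qty><giftwrap>yes</giftwrap></order>
```

Existing documents without the new node remain valid, which is why schema evolution is so much cheaper in the XML model.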
Some of you might wonder, "well, this is great news around XML, but my data doesn't live in a single partition -- my data is spread across multiple database partitions" -- which is the right thing to do if you want scalability and you want to be able to use features such as DPF in DB2. Now, what is interesting is that DB2's scalability features fully support pureXML. After you hash-partition your data, each row of a given table is placed in a specific database partition based on the hash value of the table's distribution key. On reading or writing, DB2 automatically directs the work to the relevant partition. So there is no work on your part, as a DBA or a developer, to make that happen -- it happens automatically for you. Now, why is this important? It's important because even if your XML data resides in multiple partitions, you can benefit from parallelizing operations that touch different parts of the data. In the example here, you have sales for four quarters in four different partitions of your database -- that's the way it was specified in the CREATE TABLE clause. So if you want to run an annual report, for instance, you are able to parallelize operations across each of these quarters and run the report. Obviously there are huge performance gains to be had. The other thing I want to mention about this statement is the PARTITION BY RANGE clause, which some of you might have picked up on. This is important because the great thing about the XML implementation within DB2 is that you manage range-partitioned tables with XML columns in exactly the same way that you would for purely relational tables. So if you want to create tables that house your XML data, you would use the PARTITION BY RANGE clause just as you would with relational data. Another aspect of this is multidimensional clustering tables.
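Before moving on, the hash-based routing just described can be sketched in a few lines. This is a deliberate simplification -- DB2 actually routes through a partition map rather than a bare modulo, and the key value here is invented -- but the idea that the same deterministic function directs every read and write is the same:

```python
import hashlib

N_PARTITIONS = 4  # e.g. one database partition per quarter's data

def partition_for(distribution_key: str) -> int:
    """Map a row's distribution-key value to a partition number.
    (Simplified: DB2 hashes into a partition map, not a bare modulo,
    but the routing principle is identical.)"""
    digest = hashlib.sha256(distribution_key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % N_PARTITIONS

# Every read or write of a row is routed by the same function, so the
# engine always knows which partition holds it -- no DBA involvement.
p1 = partition_for("customer-1042")
p2 = partition_for("customer-1042")
assert p1 == p2   # deterministic: same key always lands on same partition
print(p1)
```

Because routing is deterministic, a query that touches many keys can be fanned out to all partitions at once and run in parallel, which is where the annual-report speedup comes from.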
Those of you who are used to working with star schemas, especially in SAP environments -- this is very important. If you want to use XML columns in multidimensional clustering, you use the ORGANIZE BY DIMENSIONS clause of CREATE TABLE, and everything else stays the same. Slide 26 Okay, so there are two more aspects that I want to talk about here. The first is packaging flexibility. We believe that the more options we are able to offer our customers, the better fit they will be able to find for their organization -- not just in terms of the edition of the database, but all the way up through the stack. You might have heard of the workload-optimized systems paradigm -- and the point there is that we are looking to offer as much flexibility as possible to customers in the way that they build and grow these workload-optimized systems in a modular fashion. And I want to talk to some aspects of that flexibility as it relates to DB2. Slide 27 I want to start with sub-capacity licensing. This is a very important aspect of the way that DB2 is licensed, sold, and bought. The point here is that virtualization is everywhere. It makes a lot of sense … it increases utilization … and there is a lot of workload that can be managed nicely using virtualization. There are multiple ways you can achieve server virtualization. And one of the ways you can install and deploy your databases in a virtualized environment is this: think of a multi-core box that you install DB2 on -- if it is a six-processor box and you install DB2 on only three of those processors, you will be charged only for the three cores that you are running DB2 on. Just because the box has six cores does not mean you are going to be paying for six cores, and that's very different from the way other vendors do it. They typically charge for the box and don't care how many cores you are running their products on. This is significant.
A lot of customers have come back and told us that this has helped them increase their utilization and also lower license costs, which they could then use for other purchases that would be more revenue oriented. The other point around flexibility is capacity on demand. Sid talked about pureScale, and the way pureScale can be configured and purchased is that you just need to purchase additional license files and you can then add capacity to your cluster. So think of cyclical businesses … tax season is coming up … think of businesses that grow exponentially for six or eight weeks of the year and then go back to a steady state that is significantly lower … think of businesses in retail that have huge spikes around Thanksgiving and Christmas and then go back to somewhat of a flatter demand curve for the remainder of the year. If you are looking to increase capacity for a few weeks, or even down to days, pureScale can be licensed on a per-day basis, and therefore you pay for the capacity that you use instead of overprovisioning for your peak capacity and then having all that infrastructure lie idle. We offer you the flexibility of paying as you go. The other point I want to make about flexibility is pureXML. I mentioned that any of you who have tried adding a column to a table know it's a lot of work. With XML, you are able not only to add columns, but to change data on the fly. You are able to do a lot of manipulations to your data that are not possible in the way that other vendors store XML, which is essentially as large objects. And finally, I want to talk about the DB2 Advanced Enterprise Server Edition. Slide 28 So there is flexibility in terms of the editions of DB2. We start all the way from DB2 Express-C, which is a free version available to developers. A lot of students use it … a lot of professors use it to teach DB2 courses in the classroom … there are even small companies that will run DB2 Express-C.
There are certain memory and processor limits, but there are a lot of small companies -- little offices that don't really need all that much -- and therefore they are able to use the free version of DB2. But in addition to that, you also have DB2 Workgroup and DB2 Express, which are typically oriented towards departmental applications. And on top of that, you have DB2 Enterprise Edition, which essentially has all the high availability, the compression, and pureXML -- all the different offerings under the DB2 umbrella. Now, recently we announced another edition on top of the Enterprise Edition: DB2 Advanced Enterprise Server Edition. And this is great value, because really, if you look at the way it is priced, it is only about 10% more than the Enterprise Edition, but it packs a wallop. It's got storage optimization, which is your deep compression … it has got advanced access control -- the LBAC I talked about earlier … it has got the workload management features … it has got tools from Optim® … there are performance, development, and administration tools. And these are tools not just to manage DB2 -- if you have Informix or Sybase or Oracle or Microsoft, they will help you manage heterogeneous environments as well. There is the replication feature as well as Federation Server. So there is a lot of value that has gone into this edition. There are no memory usage or processor core limits -- it is basically limitless in terms of the amount of memory that you want to add to those machines. Some of the other editions have certain limits, but Enterprise Edition has neither of those, and we are seeing customers come back and tell us that this is actually great value -- not just because of the amount of features that has gone in there, but also, if you just look at pure pricing, this is less expensive than a bare-bones Oracle license, and that has significant value. So I just wanted to mention DB2 Advanced Enterprise Server Edition as well.
Slide 29 All right, so I wanted to wrap up today's technologies discussion with the last one … number 10 … again, last but not least, very important -- tools. Slide 30 What are all the different tools that IBM offers around productivity, performance, security, problem determination, and that fuzzy thing called the Cloud? I will start with the Cloud. DB2 is available on the Cloud. We have been out there on the Amazon Cloud for many years now, even before the Amazon Cloud was really famous. And DB2 also has its own Cloud services platform through the IBM Cloud platform. What's interesting about the DB2 offering on the Cloud is that it is one of the most sought-after and most used offerings on the Cloud. Databases, by the way, are by far the most used software on the Cloud, and out of those, DB2 has seen some tremendous success. So I just wanted to mention that here as well, and there are tools that will help you deploy and deliver out to the Cloud. And the great thing about the way DB2 is licensed and developed is that, whether it's DB2 Express-C or Workgroup or Enterprise or DB2 on the Cloud, it is all the same core engine, which is great. We think of it as a Russian doll model, where you have a complete replica of the doll inside of another doll, and this is great because what it means is that when you move up editions, or when you move from a standard deployment to a cloud deployment, you don't have to change your applications. It's all the same code that your applications are running against, and therefore it's just a matter of changing the licenses -- so that's an additional benefit as well. To talk about the multitude of tools, I want to take an example of something called SQL injection.
Now, some of you might know the term; essentially it is a type of hacker attack, and the way it works is this: there are certain data-rich applications that save user inputs in the database, and they are not able to tell whether the input is SQL or an actual valid user input. And there is dynamic SQL that is generated on the basis of that input. So essentially what the hackers do is, instead of putting in your typical user-defined inputs, they put in pieces of SQL code that enter your system and generate dynamic queries. That can be a huge problem, and it is really a problem in the SQL standard itself, which doesn't make a distinction between the control plane and the data plane. And this is a huge issue. DB2 has tools that can help you address it. SQL injection affects confidentiality, because databases generally hold sensitive data, and loss of confidentiality is a frequent problem. Authentication is another issue: since SQL commands are used to check user names and passwords, it may be possible to connect to a system as another user with no previous knowledge of the password. This is a huge authorization issue. If authorization information is held in a SQL database, it may be possible to change this information through successful exploitation of a SQL injection vulnerability, and, of course, all of this leads to integrity issues. So there are huge issues around SQL injection, and there have been quite a few cases of it every year. The tools around DB2 can help you limit user access and reduce SQL injection, because you are able to grant execute privileges on query packages rather than access privileges on tables. That's a key difference there. Another differentiator is acceleration of problem resolution. You are able to trace SQL execution back to a specific package and pinpoint the originating source. You are also able to visualize application SQL and its correlated metadata, increase system capacity, and drive down database cycles.
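To make the control-versus-data distinction concrete, here is a small sketch using Python with SQLite standing in for the database (the table and credentials are invented for illustration); the same parameter-marker idea applies to DB2, where drivers likewise accept `?` markers:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

attacker_input = "' OR '1'='1"   # a classic injection payload

# Vulnerable pattern: user input concatenated into dynamic SQL, so the
# payload crosses into the control plane and rewrites the WHERE clause.
dynamic = ("SELECT count(*) FROM users WHERE name = 'alice' "
           f"AND password = '{attacker_input}'")
bypassed = conn.execute(dynamic).fetchone()[0]

# Safe pattern: a parameter marker keeps the input in the data plane;
# the payload is compared as a literal string, never executed as SQL.
safe = conn.execute(
    "SELECT count(*) FROM users WHERE name = ? AND password = ?",
    ("alice", attacker_input)).fetchone()[0]

print(bypassed, safe)   # 1 0 -- the login check is bypassed only in the dynamic query
```

Granting applications execute privileges on prepared packages, rather than direct table access, is the database-side version of the same principle: user input never gets to define new statements.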
Slide 31 So there are tools … I have listed some of these tools on slide 31, around application development, performance management, and database availability management. Again, many of these tools run in heterogeneous environments. You are able to manage more than one database using a single tool, and you are able not just to look for vulnerabilities but actually to solve a lot of those problems. Slide 32 Okay, so that concludes the ten technologies, and I hope you have enjoyed it. I want to conclude with upcoming topics in this series. This was the first Webcast in the series for Oracle Professionals, and the series will cover many other topics as we get through the year. The next one scheduled is on the Advanced Enterprise Server Edition that I just talked about -- then SQL compatibility, which Sid talked about, data storage, pureXML, and DB2 for SAP -- that's surely going to be an interesting one. And there are a bunch of other topics that are scheduled. If there is a topic that you would like to request, we will give you an e-mail address on the next slide that you can write to, and we would love to hear from you. On the bottom right of chart 33, there is also some more information about the workshops that I was telling you about. This is a free workshop … it's two days … very hands-on; you will get your hands dirty with the code … you will get to see demos … you will get to run your own code and really see for yourself the compatibility between PL/SQL and DB2's dialect of SQL. And there is also the certification opportunity, so for those of you who might be looking for career advancement or looking for new opportunities, certification is definitely the way to get started. Slide 34 And with that, I'd like to wrap it up. If you have questions or feedback about today's session, we would love to hear from you. Cindy Russell is really the person behind this -- she runs the DB2 practitioner program -- and I want to thank her for setting this up.
She would be delighted to hear from you -- whether it is feedback on today's session, topics that you would like to hear, or follow-up questions on things we talked about today, please feel free to email her at that address. With that, I would like to thank you all. © Copyright IBM Corporation 2011. IBM Software Group Route 100 Somers NY 10589 U.S.A. Produced in the United States of America May 2011 All Rights Reserved. IBM, the IBM logo, ibm.com, and DB2 are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these or other IBM trademarked terms are marked with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the web at "Copyright and trademark information" at ibm.com/legal/copytrade.shtml Oracle is a registered trademark of Oracle Corporation in the United States, other countries, or both. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States, other countries, or both. Windows is a trademark of Microsoft Corporation in the United States, other countries, or both. Other company, product or service names may be trademarks or service marks of others.