With my recent website move I thought I better check for broken links, in case I had screwed anything up during the transfer. The last few times I’ve done this I’ve used SiteCrawl, which seems to do a decent job.
After the scan had finished I looked through the results and had a couple of broken internal links and 171 broken external links. Can you guess where the vast majority of the broken external links were pointing to? Yes, it was Oracle documentation. A quick search on my blog reveals about six rants I’ve posted about this over the years. There have been many more incidents of course.
Fixing this stuff is very time consuming and boring. What’s worse, it robs me of time I could spend on creating new content.
I’m guessing most content producers don’t go back and check for broken links. Oracle certainly don’t because I see them all the time in whitepapers, articles and forum entries. The result of this is a sea of helpful content produced by the community that are littered with broken links to Oracle content. It lowers the value of the community content, through no fault of the content producer.
Oracle. Please stop doing this! You are constantly devaluing the content produced by us in the community!
Update: You should probably read this by Kim Berg Hansen.Fixing broken links to Oracle documentation… Again… was first posted on May 9, 2016 at 9:59 am.
©2012 "The ORACLE-BASE Blog". Use of this feed is for personal non-commercial use only. If you are not reading this article in your feed reader, then the site is guilty of copyright infringement.
Azure uses the Amazon model for pricing in that it charges on a processor shape. Automated backup and the APEX interface/libraries are not included and requires a separate step to install and configure the software on top of the Oracle database. The default operating installation is Windows Server and you are billed monthly for this shape as well. Azure does not offer backup or any other part of a managed service but only provided a virtual image with the Oracle database pre-installed on a Windows Server in the cloud. The pricing for this software is ShapeCoresRAMDiskStandard EditionEnterprise EditionShape price A0 Basic or Standard0.250.75 GB19 GB$1.11/hr or $826/mo$3.16/hr or $2,351/mo0.02/hr or $15/mo A1 Basic or Standard11.75 GB224 GB$1.11/hr or $826/mo$3.16/hr or $2,351/mo0.09/hr or $67/mo A2 Basic or Standard23.5 GB489 GB$1.11/hr or $826/month$3.16/hr or $2,351/mo0.18/hr or $134/mo A5 Standard214 GB489 GB$1.11/hr or $826/month$3.16/hr or $2,351/mo0.33/hr or $246/mo A3 Basic or Standard47 GB999 GB$1.28/hr or $952/mo$6.32/hr or $4,702/mo0.036/hr or $268/mo A6 Standard4428 GB999 GB$1.28/hr or $952/mo$6.32/hr or $4,702/mo A4 Basic or Standard814 GB2,039 GB$2.55/hr or $1,897/mo$12.63/hr or $9,397/mo0.72/hr or $536/mo A7 Standard856 GB2,039 GB$2.55/hr or $1,897/mo$12.63/hr or $9,397/mo1.32/hr or $982/mo A8 Standard856 GB382 GB$2.55/hr or $1,897/mo$12.63/hr or $9,397/mo A10 Standard856 GB382 GB$2.55/hr or $1,897/mo$12.63/hr or $9,397/mo A9 Standard16112 GB382 GB$5.10/hr or $3794/mo$25.27/hr or $18,801/mo A11 Standard16112 GB382 GB$5.10/hr or $3794/mo$25.27/hr or $18,801/mo D1 Standard13.5 GB50 GB$1.11/hr or $826/mo$3.16/hr or $2,351/mo$0.14/hr or $104/mo D2 Standard27 GB100 GB$1.11/hr or $826/mo$3.16/hr or $2,351/mo$0.28/hr or $208/mo D11 Standard214 GB100 GB$1.11/hr or $826/mo$3.16/hr or $2,351/mo$0.33/hr or $246/mo D3 Standard414 GB200 GB$1.28/hr or $952/mo$6.32/hr or $4,702/mo$0.56/hr or $417/mo D12 Standard428 GB200 GB$1.28/hr or $952/mo$6.32/hr or $4,702/mo$0.652/hr or $485/mo D4 Standard828 GB400 GB$2.55/hr or $1,897/mo$12.63/hr or $9,397/mo$1.12/hr or $833/mo D14 Standard16112 GB800 GB$5.10/hr or $3,794/mo$25.27/hr or $18,801/mo$2.11/hr or $1571/mo
Note that this is a generic database pricing with compute shape pricing. It is not a Schema as a Service pricing. We still need to layer APEX on top of the generic database installation. This provides the cheapest option for deploying Schema as a Service as $841/month or $1.13/hr for Standard Edition or $2,366/month or $3.18/hr for Enterprise Edition. We can basically stop here. Running an Oracle database on a single core with less than 1 GB is unusable. Going to the A1 Basic or Standard shape increases the memory to 1.75 GB which might work for a single schema and the 70 GB of disk will store enough tables for most use cases.
Let's walk through creation of an Oracle database on Azure. First we go to Azure portal and search for the Oracle virtual machine by clicking on New.
We are looking for either the Enterprise Edition or Standard Edition of the database.
We select the Standard Edition and ask that it be provisioned.
Once we select the product, we select the shape that we will run this instance on. The shape recommended is relatively large so we have to look at all shapes and we select the A1 shape primarily for cost reasons. We should look at what we are trying to do and load the right core count and memory footprint.
We select the network and storage models for this instance. We go with the defaults in this example rather than adding another storage instance since we are just looking for a small footprint.
Some things to note here. First, we have the option of a username and password or ssh to connect to the operating system. Second, we are never presented with any information about the operating system. Based on the documentation that we read earlier, I assume that this would be Windows Server. It turns out that is actually Oracle Enterprise Linux 6.7. Third, we are never asked information about the database. We are not asked about the SID, password for sys, or ports to open up or use to connect to the database. It turns out that the database is installed in the /u01 directory but no database is created. You still need to run the dbca to create a database instance and start a listener. There are other OS options available to install the database on. We theoretically could have selected Windows 2012 but these options did not come up with our search.
It took a few minutes to start up the database virtual machine. We can look up the details on the instance to find the ip address to connect to and use putty or ssh to connect to the instance.
When we log in we can look at the installation directory and notice that everything is installed in /u01. When we look at the /etc/oratab we notice that nothing is configured or installed. We will need to run oratab to create a database instance. We will then need to download and install APEX to configure Schema as a Service.
In summary, we can install and run an Oracle database on Azure. The installation is lacking a bit. This is more of an Infrastructure as a Service with a binary pre-installed but not configured. This is not Database as a Service and lacking when we compare it to Schema as a Service. To get Schema as a Service we need to download software, install it, change the network configuration, and update the virtual machine network (we did not show this). Microsoft has a good tutorial on the steps needed to create the database once the virtual machine is installed. You do get root access to the operating system. You do get sys access to the database. You do get file system access through the operating system. The documentation says that the service is on Windows but got provisioned on Linux. We might have done something wrong looking back or the documentation is wrong. The service is priced for generic database starting at $800/month or more based on the shape you select. This installation is not DBaaS or PaaS. Backups are not automatically done for you. Patches are not configure or installed. You basically get an operating system with a binary configured. The database is not configured and ports are not configured to allow you to connect across the internet.
Recently I published a link to a summary report for the recent Oracle Utilities Customer Care and Billing 2.5 benchmark. Due to popular demand, we have released additional information about the benchmark including some configuration advice in a new additional whitepaper Oracle Utilities Customer Care and Billing V2.5 and 2.4 Comparison Benchmark Whitepaper (Doc Id: 2135359.1) now available from My Oracle Support.
This whitepaper was provided from our performance team and provides additional technical information about the benchmark setup as well as the results.
This week is the ICMI Contact Center Expo 2016 in Long Beach, CA. It is a leading customer service industry event. Will you be there?
If you're on the fence, here's a good reason to attend. Mazda's Yvonnne Burkhouse, IT director applications development (pictured left), is speaking. Mazda, a Japanese automaker, is an Oracle Service Cloud customer. Burkhouse will present "Driving Beyond Loyalty and Exceptional Customer Engagement Center Experiences at Mazda." The company is live on Oracle Service Cloud and has a winning contact center solution story with convincing results.
I woke up to find all my WordPress installations had automatically updated to WordPress 4.5.2 overnight.
It’s a security release, so if it’s not already auto-updated for you, you might want to log into your dashboard and give it a nudge.
- I moved to Singapore thinking it would be for a year or two at least and is now looking like it will be longer.
- I started my first role as a permanent employee of someone else's organisation for the first time in around 22 years. (This was the one that most blew away those who know me well!)
- I went back to being a Production DBA, having worked on general Oracle performance and development with dev teams over my 5 years of different contracts at UBS.
- Pretty soon, I was heading up the Level 3 Production DBA team and doing less and less technical work because of the long list of other things I had to take care of.
- I only did it out of necessity (it was what the organisation needed at the time) but I started finding myself enjoying those new things more than I expected.
My new job will be pretty wide-ranging and if I were to explain it completely it would not only be professionally inappropriate but I suspect aspects of it would convince people I've gone mad or am having a mid-life crisis! However, one aspect that actually drew me towards it and would send most of my peers running for the hills is that I will be a Situation Manager (one of a small team) on Major Incidents. In my most recent role, I found myself constantly helping with the incident management parts while the real DBAs did the work – each of us doing what we were best at. Imagine having that in your home at all hours of the day and night and yet Mads reaction was - 'you love doing that, don't you?'. Bless her
It turns out that I’m pretty OK at cajoling, organising and perhaps speaking slightly loudly at people during high pressure situations. Maybe it's growing up with an RAF Sergeant as a father or maybe, just maybe ...
I like to think of Mr. Wolf as the ultimate Situation Manager.
At least that model works well for me in *my* head
The Linux Pre-Install RPM for Oracle E-Business Suite Release 12.1 and 12.2 on Oracle Linux 7 is now available for use by customers.
The package (oracle-ebs-server-R12-preinstall) is available on the 'addons' channel on the Oracle Unbreakable Linux Network (ULN, which requires a support contract) or from the Oracle public yum repository. It is recommended that this x86_64 Pre-Install RPM be installed on top of a 'minimal' installation of Oracle Linux to provide just the necessary packages to install and run Oracle E-Business Suite Release 12.
As with the Oracle Database Pre-Install RPMs, this RPM provides all required E-Business Suite packages for both the application and database tiers, while configuring the systems with particular kernel and other values to meet requirements published in our Linux x86-64 Installation and Upgrade Notes (see below).
This Pre-Install RPM will significantly reduce the preparation time of an environment prior to installing, upgrading or cloning an E-Business Suite R12 instance by performing the following tasks:
- Downloading and installing all software package versions and dependencies required for installing E-Business Suite R12 (12.1, 12.2)
- Creating the users oracle and applmgr for use as owners of the database and application tiers respectively, while setting hard and soft shell resource limits
- Updating kernel parameters in /etc/sysctl.conf to recommended values
- Sets DNS resolver parameters in /etc/resolv.conf to minimum recommended values
- Sets 'numa=off' in the kernel in the kernel commandline
- Disables 'Transparent Huge Pages (THP)' if enabled
For more information and details, please review the following documents:
- Oracle E-Business Suite Installation and Upgrade Notes Release 12
(12.1.1) for Linux x86-64 (My Oracle Support Document 761566.1)
- Oracle E-Business Suite Installation and Upgrade Notes Release 12 (12.2) for Linux x86-64 (My Oracle Support Document 1330701.1)
NOTE: If you want to read and play with the interactive application, please go to the shinnyapps article. It has been developed using Shiny/R in order to allow you to see the effects of the algorithms.
Thanks to Valerie Parham-Thompson at Pythian and Daniel Prince at Oracle.
Github repository contains the code to generate and load the data and also, the Shiny/R code.Some initial thoughts
A couple of weeks ago one of our customers came up with a question regarding FTS over InnoDB engine. Although the question is not answered in the current article, I came up with the conclusion that FTS is sometimes misunderstood.
The point of this article is to show dynamically how the search algorithms work, using non-fictional data (data sources were downloaded from Gutenberg project within an easy interface (please see at the bottom of the ShinnyApps post here) .
In order to show the effects off the field sizes over the query expansion algorithm, you will see two main tables (bookContent and bookContentByLine) both containing the same books in different approaches: by line and by paragraph. You’ll see the noise generated by the
QUERY EXPANSION algorithm when phrases are too large.
For the sake of simplicity, in this article we won’t go through the FTS parsers. That is possible material for a future post.Why I consider FTS sometimes misunderstood?
FTS is a technology that can be use for any purpose, not only simple searches. Generally, FTS engines are placed to work as a service for web or document searches, which generally require technologies like Solr, ElasticSearch or Sphinx. However, certain bussines rules require complex searches, and having such feature inside RDBMS can be a win.
RDBMS aren’t a good place for massive amount of FTS queries, without using any of the join capabilities that they offer, or the ACID properties.
As I said above, FTS is totally acceptable in RDBMS, if you are using at least one RDBMS main feature, required by your bussines model.Action!
To start showing the effects of the algorithms, the following example searches the word ‘country’ using query expansion. This means that we are not looking only the exact matches, but also the entries that appear the most when the the exact match has been found.
SELECT clause you’ll see both FTS expressions using
NATURAL LANGUAGE with query expansion and
BOOLEAN modes respectively.
View the code on Gist.
The noise generated by the query expansion is expected and described in the official documentation here.
The interesting case is the following row, which has 2 exact occurrences (you can see the positions 1 and 63) and it is not the highest rank using query extension. Remember, this is expected.
Text: "country districts. As Lucca had five gates, he divided his own country"
This is even worse when using large sentences. In the example bellow you will see the same query, against the table storing by paragraph. The boolean rank shows some of the entries way above others, however the query extension locates at the top records that not necessarily has a lot of exact matches.
View the code on Gist.
The query expansion is useful when you intend to search which entries contain more words that appear frequently within the search term. Having large text fields increase the probability to have more words that appear among the search term. In the case of bookContent table (by paragraph table), the average field size is 443.1163 characters.The INNODB_FT_INDEX_TABLE
There is a way to play with the contents of the FTS indexes. As you may noticed in the previous examples, I used the
set global innodb_ft_aux_table = 'ftslab/bookContent'; statement, which loads the index content to memory for an easy querying.
If you use RDS, the option innodb_ft_aux_table is not available as it is GLOBAL and require SUPER privileges.
i.e. You can easily get the most frequent tokens:
View the code on Gist.
We can query the index contents with a simple SQL statement like the following:
View the code on Gist.
In the example shown before the is no intention to compare ranks score as they are based in different algorithms. The idea there is to show that QUERY EXPANSION can have non desire results in some cases due to its mechanism.Building custom stopwords
It probably isn’t very useful information as most of these words appears too frequently and are modal verbs, adverbs, pronouns, determiners, etc. It could be the case that you are not interested on indexing those words. If that’s the case you can add them as
stopwords in your own stopwords table. Specially if you are more interested in boolean searches, loosing some part of the language expressions.
We can build a custom stopwords table based on our current data:
View the code on Gist.
Let’s build our stopwords table using both default and new entries and keeping the alphabetical order:
View the code on Gist.
The idea behind choosing our own stopwords is to measure how much index do we safe filtering those words that are extremely frequent and don’t add a necessary meaning to the search. This topic could be covered in a separate blog post.Going ahead on choosing stop words
The full article is amazingly interesting. In brief, it says that the most frequent word will occur approximately twice as often as the second most frequent word, three times as often as the third most frequent word, and so on (rank-frequency distribution is an inverse relation).Considerations and recommendations
– Use QUERY EXPANSION only if you are interested in searching relations over exact matches. Remember that the field
size is crucial when using this.
– FTS is not the best fit for exact string matches in single columns. You don’t want to use FTS for searching emails in a single column, name and lastname fields , i.e. For those, you’ll probably use other techniques as reverse searches , exact match operator (=) or hashing (CRC32 for emails or large texts smaller than 255 characters).
– Keep your FTS indexes short. Do not add ALL the text columns. Parse first from your application the user search and adapt the query.
– If you are using BOOLEAN MODE, you can use the rank score to filter rows. MySQL is clever enough to optimize the
FTS functions to avoid double executions. You can do this using something like:
match(content,title) against ("first (third)") > 1 . Generally, scores lower than 1 can be ignored when using boolean or natural mode searches.
– `OPTIMIZE TABLE` does a rebuild of the table. To avoid this, set innodb_optimize_fulltext_only=1 in order to do an incremental maintance on the table.
– Recall that NATURAL LANGUAGE MODE does not take the operands as the BOOLEAN MODE. This affects the ranking score (try +bad (thing) i.e.)
– If you plan to order by rank, it is not necessary to specify the clause `ORDER BY` as InnoDB does the order after retrieve the doc ids . Also,the behavior is different from the default as it returns the heaviest at the top (like an ORDER BY rank DESC).
– If you come from MyISAM’s FTS implementation, recall that the ranking scoring is different.
– Create the FULLTEXT index after the data is loaded InnoDB Bulk Load. When restoring FTS backups, you will probably hit the “ERROR 182 (HY000) at line nn: Invalid InnoDB FTS Doc ID”.
– Try to avoid using use more than one FTS expression in the where clause. Keep in mind that this affects the order in the results and it consumes a considerably amount of CPU. InnoDB orders by the latest expression in the WHERE clause. WL#7123.
– Also, if avoiding the rank information in the projection (SELECT clause) and using other aggregations like count(*), will use the “no ranking” FT_hints. The
LIMIT hint won’t be used if invoked explicitly an
ORDER BY and the
MATCH clause in the projection.
View the code on Gist.
– If you plan to use FTS_DOC_ID column with AUTO_INCREMENT option, have in mind that there is a limitation regarding this. You must declare a single column PRIMARY KEY constraint or as an UNIQUE index. Also, the data type is stricted as `bigint unsigned`. i.e:
View the code on Gist.FT_QUERY_EXPANSION_LIMIT
This variable controls the number of top matches when using `WITH QUERY EXPANSION` (affects only MyISAM). Reference.Bug 80347 – Invalid InnoDB FTS Doc ID
emanuel@3laptop ~/sandboxes/rsandbox_5_7_9 $ ./m dumpTest < full.dump
ERROR 182 (HY000) at line 73: Invalid InnoDB FTS Doc ID
emanuel@3laptop ~/sandboxes/rsandbox_5_7_9 $ ./m dumpTest < ddl.dump
emanuel@3laptop ~/sandboxes/rsandbox_5_7_9 $ ./m dumpTest < onlyData.dump
emanuel@3laptop ~/sandboxes/rsandbox_5_7_9 $ ./m dumpTest < full.dump
ERROR 182 (HY000) at line 73: Invalid InnoDB FTS Doc ID
mysqldump is not very clever if you use `FTS_DOC_ID`:
2016-02-13T22:11:53.125300Z 19 [ERROR] InnoDB: Doc ID 10002 is too big. Its difference with largest used Doc ID 1 cannot exceed or equal to 10000
It takes dumps without considering the restriction coded in `innobase/row/row0mysql.cc`:
Difference between Doc IDs are restricted within
4 bytes integer. See fts_get_encoded_len()
The fix to this is backuping the table by chunks of 10000 documents.Other useful links
Happy Friday! Did you miss our latest Virtual Technology Summit or want to explore a track you didn't before? If so, visit a Replay Library Group today! Tip - join the group and you'll be notified of updates automatically.
Last week several of my colleagues and myself had the privilege of attending the Samsung Developers Conference (SDC) in San Francisco. It was the 5th time Samsung organized a developers conference in San Francisco but only the first time I attended, although some in our party were present previous times so I had some idea of what to expect. Here are some impressions and thoughts on the conference.
After an hour walking around, my first thought was: is there anything that Samsung doesn’t have their hand in? I knew of course they produce smart phones, tablets, smart watches and TVs, I’ve seen a laptop here and there, but vacuum cleaners, air conditioning units and ranges? Semi-conductors (did you know that inside the iPhone there are Samsung chips?), Smart fridges and security cameras and now VR gear and IoT, pretty crazy. Interestingly enough, I think there are some distinct advantages that Samsung might have because of this smorgasbord of technology over more focused companies (like say Apple) , more on that later.
As with all of these events, Samsung’s motivation for organizing this conference is of course not entirely altruistic; as I mentioned in the intro, they have a huge hardware footprint and almost all of that needs software, which gets developed by … developers.
They need to attract outside developers to their platforms to make them interesting for potential buyers, I mean, what would the iPhone be without Apps? There is nothing wrong with that, that’s one of the reasons we have Oracle OpenWorld, but I thought that the sessions on the “Innovation Track” where a bit light on technical details (at least the ones I attended).
In fact, some of them wouldn’t have been misplaced in the “Marketing Track” I feel. To be fair, I didn’t get to attend any of the hands-on sessions on day zero, maybe they were more useful, but as a hard core developer, I felt a bit … underwhelmed by the sessions.
That doesn’t mean though that the sessions were not interesting, probably none more so than “How to Put Magic in a Magical Product” by Moe Tanabian, Chief Design Officer at Samsung, which took us on a “design and technical journey to build an endearing home robot”, basically how they created this fella:
That is Otto, a personal assistant robot, similar to the Amazon Echo, except with a personality. Tanabian explained in the session how they got from idea and concept to production using a process remarkably similar to how we develop here at the AppsLab; fail fast, iterate quickly, get it in front of user as quickly as possible, measure etc. I just wish we had the same hardware tooling available as they do (apparently they used, what I can only image are very expensive 3D printers to produce the end result).
Samsung also seems to be making a big push in the IoT space, and for good reason. The IoTivity project is a joint open source connectivity framework, sponsored by the Open Interconnect Consortium (OIC) of which Samsung is a member and one of the sessions I attended was about this project.
The whole Samsung Artic IoT platform supports this standard, which should make it easy and secure to discover and connect Artic modules to each other. The question as always is: will other vendors adopt this standard so that you can do this cross-vendor, i.e. have my esp8266’s talk to an Artic module which then talks to a Particle and my Philips Hue lights etc.
Without this, such a new standard is fairly useless and just adds to the confusion.
As mentioned in the intro though, because Samsung makes pretty much everything, they could start by enabling all their own “things” to talk to each other over the internet. Their smart fridge could then command their robotic vacuums to clean up the milk that just got spilled in the kitchen. The range could check what is in the fridge and suggest what’s for dinner. Artic modules can then be used as customizations and extensions for the few things that are not built by Samsung (like IoT Nerf Guns :-), all tied together by Otto which can relay information from and to the users.
This is an advantage they have over e.g. Google (with Brillo) or Apple (with HomeKit) who have to ask hardware vendors to implement their standard; Samsung has both hardware and the IoT platform, no need for an outside party, at least to get started.
Personally, I’m hoping that in the near future I get to experiment with some of the Artic modules, they look pretty cool!
And then of course there was VR; VR Gears, VR vendors, VR Cameras even a VR rollercoaster ride (which I tried and of course made my sick, same as with the Oculus Rift demo at UKOUG last year), maybe I’m just not cutout for VR. One of the giveaways was actually a Gear 360 camera which allows you to take 360 degree camera footage which you can then experience using the Gear VR, nicely tying up the whole Samsung VR experience.
All in all it was a great conference with cool technology showing off Samsung’s commitment to VR and IoT.
Oh, and I got to meet Flo Rida at an AMA session
What is Adaptive Query Optimization?
Adaptive Query Optimization is a term for a set of new features in Oracle 12c to allow the optimizer to improve the accuracy of execution plans. It is described in the Optimizer with Oracle 12c white paper. Put simply, Oracle collects additional statistics at parse which it can use to generate a better execution plan than if just using the ordinarily collected statistics, and further statistics at execution time which can be used to decide to change the next execution of the same SQL statement.
Why does it cause a Problem in PeopleSoft?Adaptive Optimization is clearly a very useful feature in an application where SQL is shareable and reused. But there's the rub. This is another Oracle feature that works best with an application that is written the way that Oracle advise applications be written (Automatic Memory Management and the plan stability/management technologies also come to mind), and PeopleSoft is not such an application. PeopleSoft applications are SQL parse intensive.Parsing SQL is an expensive business in terms of time and CPU. The library cache preserves the results of previous parses to save the overhead of repeatedly parsing the same SQL for subsequent executions. If the SQL is different, even just by a literal value, it will have a different SQL_ID and will not be matched. If SQL statements in the library cache are not being reused, then you incur the cost of parse for each execution. Adaptive Optimization adds to that overhead, in PeopleSoft sometimes to the point where it can have a significant effect on performance.PeopleSoft has several behaviours which lead to different SQL statements being generated from the same source:
- Application Engine %BIND() variables are resolved to literals in the generated SQL if the ReUseStatement attribute is not set, which by default it is not. Thus, each iteration of a loop may cause the same Application Engine step to produce a different SQL statement with a different SQL ID (see Performance Benefits of ReUse Statement Flag in Application Engine). PeopleSoft development have got better at using this attribute in delivered code. However, there are still many places where it could be set but is not.
- There are many places in PeopleSoft where SQL is generated dynamically to cater for dynamic application configuration options. Bind variables may be embedded in the SQL as literals, or more significant changes may be introduced such as completely different criteria or joining different tables. In Application Engine, ReUseStatement usually cannot be used in conjunction with such steps.
- Different concurrently executing instances of the same Application Engine programs will be allocated a different non-shared instance of a temporary record, so they reference different tables.
- "optimizer_adaptive_features = FALSE - After upgrading to Oracle Database 12c, many PeopleSoft customers have noticed overall performance degradation which is related to Oracle 12c Optimizer Adaptive Feature (OAF). It is recommended that this value be disabled in all Oracle Databases running PeopleSoft Applications."
- It has always been tempting to enable cursor_sharing with PeopleSoft so that Oracle converts literals in the SQL back to bind variables and thus matching SQLs in the library cache with different literal values. However, it has no benefit for some dynamic SQL and different temporary table instances. It can also introduce other optimizer problems. On the few occasions that I have tried it, I have never had a good experience. This document also recommends against it.
- Poor SQL execution performance (where a poor plan is selected)
- Poor Parse performance (where the optimizer takes more time to determine the optimal access path)
- The scope of either of the issues may not be limited to individual statements but can impact the whole system in unforeseen ways, for example both poor SQL execution and parse performance may cause locks to be held for a greater duration than normal causing lock contention while poor parse performance may hold latches or mutexes to similar results.
- Blanket disabling of features: While disabling Adaptive Features with blanket parameters (see: Document 2031605.1) may provide a short term workaround, the loss of their benefits to other queries may be significant. If the underlying cause is some other factor then a better solution is to identify and address that"
or to disable individual sub-features
OPTIMIZER_ADAPATIVE_FEATURES=FALSE /*disables adaptive optimisation as a whole*/
_optimizer_adaptive_plans=false /*disables adaptive plans*/
_optimizer_use_feedback=false /*disables the use of statistics feedback for subsequent executions. Default is true*/
_px_adaptive_dist_method=off /*disables the adaptive parallel distribution methods*/
_optimizer_dsdir_usage_control=0 /* disables the optimizer usage of dynamic statistics (sampling) directives. Default value is 126 */
_optimizer_gather_feedback=false /*disables the gathering of execution feedback in the optimizer*/
_optimizer_nlj_hj_adaptive_join=false /*disables only the adaptive join from nested loops to hash join*/
There is no question that it is important to maintain accurate statistics and histograms on tables in PeopleSoft. I have written elsewhere about how to achieve that, especially on temporary tables. However, I am left with a concern that disabling the entire feature may be excessive, and that there may be areas in PeopleSoft where it can bring some benefits. I would like to find the opportunity to test whether it is possible to achieve better performance by only disabling certain parts of the feature.
Amazon offers a variety of database options as a managed service. This is available through the Amazon RDS Console. This screen looks like
If you go to the RDS Getting Started page you will see that you can provision
- SQL Server
- Maria DB
- or an Amazon custom database, Aurora
We won't go into a compare and contrast in this blog entry to compare the different database but go into the Oracle RDS offerings and look at what you get that compares to Schema as a Service offered by Oracle.
The Oracle database offerings that you get from Amazon RDS are
- Oracle Standard Edition One
- Oracle Standard Edition Two
- Oracle Standard Edition
- Oracle Enterprise Edition
You can launch 11g or 12c version of the Oracle database but it is important to note that through RDS you do not get access to the operating system, file system, or sys/system user. There is an elevated user that lets you perform a limited function list but not all options are available to this elevated user. Some features are also not enabled in the 12c instance
- multi-tenant or pluggable databases
- Real Application Clusters (RAC)
- Data Guard / Active Data Guard
- Connection to Enterprise Manager
- Automated Storage Management
- Database Vault
- Java libraries
- Locator services
- Label Security
- Real Application Testing
- XML DB
- Alter database
- Alter system
- Create any directory
- Drop any directory
- Grant any privilege
- Grant any role
Before we dive into the usage of Amazon RDS, let's talk pricing and licensing. The only option that you have for a license included with RDS is the Standard Edition One license type. To figure out the cost, we must look at the sizes that we can provision as well as the RDS cost calculator. To start this journey, we start at the AWS console, go to the RDS console, and select Oracle SE1 as the instance type.
If we select the license-included License Model we get to look at the shapes that we can deploy as well as the versions of the database.
We can use the cost calculator in conjunction to figure out the monthly cost of deploying this service. For our example we selected 184.108.40.206 v7, db.t2.micro (1 vCPU, 1 GB RAM), and 20 GB of storage. For this shape we find that the monthly cost will be $25.62. We selected the 220.127.116.11 version because this is the only 11g option available to us for the SE1 licensed included selection. We could have selected the 18.104.22.168 as an option. If we select any other version we must bring our own license to run on AWS. It is important to look at the outbound transfer rate because this cost is some times significant. If we put 20 GB outbound traffic the price increases to $26.07 which is not significant. This says that we can backup our entire database once a month offsite and not have to pay a significant to get our database off RDS.
It is important to look at the shape options that we have for the different database versions. We should also look at the cost associated with it. For 11g we have
- db.t2.micro (1 vCPU, 1 GB) - $25.62/month
- db.t2.small (1 vCPU, 2 GB) - $51.24/month
- db.t2.medium (2 vCPU, 4 GB) - $102.48/month
- db.t2.large (2 vCPU, 8 GB)- $205.70/month
- db.m4.large (2 vCPU, 8 GB)- $300.86/month
- db.m4.xlarge (4 vCPU, 16 GB)- $602.44/month
- db.m4.2xlarge (8 vCPU, 32 GB)- $1324.56/month
- db.m4.4xlarge (16 vCPU, 64 GB) - $2649.11/month
- db.m3.medium (1 vCPU, 3.75 GB) - $153.72/month
- db.m3.large (2 vCPU, 7.5 GB) - $307.44/month
- db.m3.xlarge (4 vCPU, 15 GB) - $614.88/month
- db.m3.2xlarge (8 vCPU, 30 GB) - $1352.74/month
- db.r3.large (2 vCPU, 15 GB) - $333.06/month
- db.r3.xlarge (4 vCPU, 30 GB) - $666.12/month
- db.r3.2xlarge (8 vCPU, 61 GB) - $1465.47/month
- db.r3.4xlarge (16 vCPU, 122 GB) - $2930.93/month
- db.m2.xlarge (2 vCPU, 17 GB) - $409.92/month
- db.m2.2xlarge (4 vCPU, 34 GB) - $819.84/month
- db.m2.4xlarge (8 vCPU, 68 GB) - $1803.65/month
- db.m1.small (1 vCPU, 3.75 GB) - $84.18/month
- db.m1.medium (2 vCPU, 7.5 GB) - $168.36/month
- db.m1.large (4 vCPU, 15 GB) - $336.72/month
- db.m1.xlarge (8 vCPU, 30 GB) - $673.44/month
If we want to create the database, we can select the database version (11g), the processor size (smallest just to be cheap for demo purposes), and storage. We define the OID, username and password for the elevated user, and click next
We then confirm the selections and backup schedule (scroll down to see), and click on Launch.
When we launch this, the system shows that the inputs were accepted and the database will be created. We can check on the status by going to the RDS console.
It takes a few minutes to provision the database instance, 15 minutes in our test. When the creation is finished we see available rather than creating for the status.
Once the instance is created we can connect to the database using the Oracle Connection Instructions and connect using sqlplus installed on a local machine connecting to a remote database (the one we just created), using the aws connection tools to get status (aws rds describe-db-instances --headers), or connecting with sql developer to the ip address, port 1521, and user oracle with the password we specified. We chose to open up port 1521 to the internet during the install which is not necessarily best practices.
Note that we have fallen short of Schema as a Service. We have database as a service at this point. We will need to layer application express on top of this to get Schema as a Service. We can install APEX 4.1.1 on the 11g instance that we just created by following installation instructions. Note that this is a four step process followed by spinning up and EC2 instance and installing optional software to run a listener because the APEX listener is not supported on the RDS instance. We basically add $15-$20/month to spin up a minimal EC2 instance and install the listener software and follow the nine step installation process to link the listener to the RDS instance.
The installation and configuration steps are similar for 12c. We can provision a 12c instance of the database in RDS, spin up an EC2 instance for the listener, and configure the EC2 instance to point to the RDS instance. At this point we have the same experience that we have as Schema as a Service with the Oracle DBaaS option.
In summary, we can provision a database into the Amazon RDS. If we want anything other than Standard Edition One, we need to bring our own license at $47.5K/two cores for Enterprise Edition or $17.5K for Standard Edition and maintain our annual support cost at 22%. If we want to use a license provided by Amazon we can but are limited to the Standard Edition One version and APEX 4.1.1 with 22.214.171.124 or APEX 4.2.6 with 126.96.36.199. Both of these APEX versions are a major version behind the 5.0 version offered by Oracle through DBaaS. On the positive side, we can get a SE One instance with APEX installed for about $50/month for 11g or $100/month for 12c which is slightly cheaper than the $175/month through Oracle. The Oracle product is Enterprise Edition vs Standard Edition One on AWS RDS and the Oracle version does come with APEX 5.0 as well as being configured for you upon provisioning as opposed to having to perform 18 steps and spin up an EC2 instance to act as a listener. It really is difficult to compare the two products as the same product but if you are truly only interested in a simple query engine schema as a service in the cloud, RDS might be an option. If you read the Amazon literature, switching to Aurora is a better option but that is a discussion for another day.
- Partner Webcast - Collaboration in a digital world with Oracle Docume…
via Blogs.Oracle.com/IMC - Slideshows by User: oracle_imc_team http://ift.tt/gEA7C8
- Partner Webcast - Oracle BI cloud service: Insights at Your Fingertips (Oracle Partner Hub: ISV Migration Center Team)
via Oracle Partner Hub: ISV Migration Center Team http://ift.tt/1AAiVSD
We share our skills to maximize your revenue!
The developers complained that their test database was so much slower than production that they could not use it to really test whether their batch processes would run fast enough when migrated to production. They did not give me any particular queries to check. Instead they said that the system was generally too slow. So, I went through a process to find SQL statements that they had run in test and that normally run in production and compare their run times. I thought that I would document the process that I went through here.
First I found the top 100 queries by elapsed time on both the test and production databases using this query:
column FORCE_MATCHING_SIGNATURE format 99999999999999999999 select FORCE_MATCHING_SIGNATURE from (select FORCE_MATCHING_SIGNATURE, sum(ELAPSED_TIME_DELTA) total_elapsed from DBA_HIST_SQLSTAT where FORCE_MATCHING_SIGNATURE is not null and FORCE_MATCHING_SIGNATURE <>0 group by FORCE_MATCHING_SIGNATURE order by total_elapsed desc) where rownum < 101;
The output looked like this:
FORCE_MATCHING_SIGNATURE ------------------------ 944718698451269965 4634961225655610267 15939251529124125793 15437049687902878835 2879196232471320459 12776764566159396624 14067042856362022182 ...
Then I found the signatures that were in common between the two lists.
insert into test_sigs values (944718698451269965); insert into test_sigs values (4634961225655610267); insert into test_sigs values (15939251529124125793); ... insert into prod_sigs values (3898230136794347827); insert into prod_sigs values (944718698451269965); insert into prod_sigs values (11160330134321800286); ... select * from test_sigs intersect select * from prod_sigs;
This led to 32 values of FORCE_MATCHING_SIGNATURE which represented queries that ran on both test and production, except for the possible difference in constants.
Next I looked at the overall performance of these 32 queries in test and production using this query:
create table common_sigs (FORCE_MATCHING_SIGNATURE number); insert into common_sigs values (575231776450247964); insert into common_sigs values (944718698451269965); insert into common_sigs values (1037345866341698119); ... select sum(executions_delta) total_executions, sum(ELAPSED_TIME_DELTA)/(sum(executions_delta)*1000), sum(CPU_TIME_DELTA)/(sum(executions_delta)*1000), sum(IOWAIT_DELTA)/(sum(executions_delta)*1000), sum(CLWAIT_DELTA)/(sum(executions_delta)*1000), sum(APWAIT_DELTA)/(sum(executions_delta)*1000), sum(CCWAIT_DELTA)/(sum(executions_delta)*1000), sum(BUFFER_GETS_DELTA)/sum(executions_delta), sum(DISK_READS_DELTA)/sum(executions_delta), sum(ROWS_PROCESSED_DELTA)/sum(executions_delta) from DBA_HIST_SQLSTAT ss,common_sigs cs where ss.FORCE_MATCHING_SIGNATURE = cs.FORCE_MATCHING_SIGNATURE;
Here is part of the output:
TOTAL_EXECUTIONS Elapsed Average ms CPU Average ms IO Average ms ---------------- ------------------ -------------- ------------- 5595295 366.185529 241.92785 59.8682797 430763 1273.75822 364.258421 1479.83294
The top line is production and the bottom is test.
This result supported the development team’s assertion that test was slower than production. The 32 queries averaged about 3.5 times longer run times in test than in production. Also, the time spent on I/O was about 25 times worse. I am not sure why the I/O time exceeded the elapsed time on test. I guess it has something to do with how Oracle measures I/O time. But clearly on average these 32 queries are much slower on test and I/O time probably caused most of the run time difference.
After noticing this big difference between test and production I decided to get these same sorts of performance metrics for each signature to see if certain ones were worse than others. The query looked like this:
select ss.FORCE_MATCHING_SIGNATURE, sum(executions_delta) total_executions, sum(ELAPSED_TIME_DELTA)/(sum(executions_delta)*1000), sum(CPU_TIME_DELTA)/(sum(executions_delta)*1000), sum(IOWAIT_DELTA)/(sum(executions_delta)*1000), sum(CLWAIT_DELTA)/(sum(executions_delta)*1000), sum(APWAIT_DELTA)/(sum(executions_delta)*1000), sum(CCWAIT_DELTA)/(sum(executions_delta)*1000), sum(BUFFER_GETS_DELTA)/sum(executions_delta), sum(DISK_READS_DELTA)/sum(executions_delta), sum(ROWS_PROCESSED_DELTA)/sum(executions_delta) from DBA_HIST_SQLSTAT ss,common_sigs cs where ss.FORCE_MATCHING_SIGNATURE = cs.FORCE_MATCHING_SIGNATURE having sum(executions_delta) > 0 group by ss.FORCE_MATCHING_SIGNATURE order by ss.FORCE_MATCHING_SIGNATURE;
I put together the outputs from running this query on test and production and lined the result up like this:
FORCE_MATCHING_SIGNATURE PROD Average ms TEST Average ms ------------------------ ------------------ ------------------ 575231776450247964 20268.6719 16659.4585 944718698451269965 727534.558 3456111.6 * 1037345866341698119 6640.87641 8859.53518 1080231657361448615 3611.37698 4823.62857 2879196232471320459 95723.5569 739287.601 * 2895012443099075884 687272.949 724081.946 3371400666194280661 1532797.66 761762.181 4156520416999188213 109238.997 213658.722 4634693999459450255 4923.8897 4720.16455 5447362809447709021 2875.37308 2659.5754 5698160695928381586 17139.6304 16559.1932 6260911340920427003 290069.674 421058.874 * 7412302135920006997 20039.0452 18951.6357 7723300319489155163 18045.9756 19573.4784 9153380962342466451 1661586.53 1530076.01 9196714121881881832 5.48003488 5.13169472 9347242065129163091 4360835.92 4581093.93 11140980711532357629 3042320.88 5048356.99 11160330134321800286 6868746.78 6160556.38 12212345436143033196 5189.7972 5031.30811 12776764566159396624 139150.231 614207.784 * 12936428121692179551 3563.64537 3436.59365 13637202277555795727 7360.0632 6410.02772 14067042856362022182 859.732015 771.041714 14256464986207527479 51.4042938 48.9237251 14707568089762185958 627.586095 414.14762 15001584593434987669 1287629.02 1122151.35 15437049687902878835 96014.9782 996974.876 * 16425440090840528197 48013.8912 50799.6184 16778386062441486289 29459.0089 26845.8327 17620933630628481201 51199.0511 111785.525 * 18410003796880256802 581563.611 602866.609
I put an asterisk (*) beside the six queries that were much worse on test than production. I decided to focus on these six to get to the bottom of the reason between the difference. Note that many of the 32 queries ran about the same on test as prod so it really isn’t the case that everything was slow on test.
Now that I had identified the 6 queries I wanted to look at what they were spending their time on including both CPU and wait events. I used the following query to use ASH to get a profile of the time spent by these queries on both databases:
select case SESSION_STATE when 'WAITING' then event else SESSION_STATE end TIME_CATEGORY, (count(*)*10) seconds from DBA_HIST_ACTIVE_SESS_HISTORY where FORCE_MATCHING_SIGNATURE in ('944718698451269965', '2879196232471320459', '6260911340920427003', '12776764566159396624', '15437049687902878835', '17620933630628481201') group by SESSION_STATE,EVENT order by seconds desc;
The profile looked like this in test:
TIME_CATEGORY SECONDS ------------------------ ------- db file parallel read 207450 ON CPU 141010 db file sequential read 62990 direct path read 36980 direct path read temp 29240 direct path write temp 23110
The profile looked like this in production:
TIME_CATEGORY SECONDS ------------------------ ------- ON CPU 433260 PX qref latch 64200 db file parallel read 35730 db file sequential read 14360 direct path read 12750 direct path write temp 12000
So, I/O waits dominate the time on test but not production. Since db file parallel read and db file sequential read were the top I/O waits for these 6 queries I used ash to see which of the 6 spent the most time on these waits.
db file parallel read:
select 2 sql_id, 3 (count(*)*10) seconds 4 from DBA_HIST_ACTIVE_SESS_HISTORY 5 where 6 FORCE_MATCHING_SIGNATURE in 7 ('944718698451269965', 8 '2879196232471320459', 9 '6260911340920427003', 10 '12776764566159396624', 11 '15437049687902878835', 12 '17620933630628481201') and 13 event='db file parallel read' 14 group by sql_id 15 order by seconds desc; SQL_ID SECONDS ------------- ---------- ak2wk2sjwnd34 159020 95b6t1sp7y40y 37030 brkfcwv1mqsas 11370 7rdc79drfp28a 30
db file sequential read:
select 2 sql_id, 3 (count(*)*10) seconds 4 from DBA_HIST_ACTIVE_SESS_HISTORY 5 where 6 FORCE_MATCHING_SIGNATURE in 7 ('944718698451269965', 8 '2879196232471320459', 9 '6260911340920427003', 10 '12776764566159396624', 11 '15437049687902878835', 12 '17620933630628481201') and 13 event='db file sequential read' 14 group by sql_id 15 order by seconds desc; SQL_ID SECONDS ------------- ---------- 95b6t1sp7y40y 26840 ak2wk2sjwnd34 22550 6h0km9j5bp69t 13300 brkfcwv1mqsas 170 7rdc79drfp28a 130
Two queries stood out at the top waiters on these two events: 95b6t1sp7y40y and ak2wk2sjwnd34. Then I just ran my normal sqlstat query for both sql_ids for both test and production to find out when they last ran. Here is what the query looks like for ak2wk2sjwnd34:
select ss.sql_id, ss.plan_hash_value, sn.END_INTERVAL_TIME, ss.executions_delta, ELAPSED_TIME_DELTA/(executions_delta*1000) "Elapsed Average ms", CPU_TIME_DELTA/(executions_delta*1000) "CPU Average ms", IOWAIT_DELTA/(executions_delta*1000) "IO Average ms", CLWAIT_DELTA/(executions_delta*1000) "Cluster Average ms", APWAIT_DELTA/(executions_delta*1000) "Application Average ms", CCWAIT_DELTA/(executions_delta*1000) "Concurrency Average ms", BUFFER_GETS_DELTA/executions_delta "Average buffer gets", DISK_READS_DELTA/executions_delta "Average disk reads", ROWS_PROCESSED_DELTA/executions_delta "Average rows processed" from DBA_HIST_SQLSTAT ss,DBA_HIST_SNAPSHOT sn where ss.sql_id = 'ak2wk2sjwnd34' and ss.snap_id=sn.snap_id and executions_delta > 0 and ss.INSTANCE_NUMBER=sn.INSTANCE_NUMBER order by ss.snap_id,ss.sql_id;
I found two time periods where both of these queries were recently run on both test and production and got an AWR report for each time period to compare them.
Here are a couple of pieces of the AWR report for the test database:
Here are similar pieces for the production database:
What really stood out to me was that the wait events were so different. In production the db file parallel read waits averaged around 1 millisecond and the db file sequential reads averaged under 1 ms. On test they were 26 and 5 milliseconds, respectively. The elapsed times for sql_ids 95b6t1sp7y40y and ak2wk2sjwnd34 were considerably longer in test.
This is as far as my investigation went. I know that the slowdown is most pronounced on the two queries and I know that their I/O waits correspond to the two wait events. I am still trying to find a way to bring the I/O times down on our test database so that it more closely matches production. But at least I have a more narrow focus with the two top queries and the two wait events.
Since documentation is not very clear to me on the topic of InnoDB flushing in combination with Linux IO (specifically the write system call), I decided to put together this article in hopes of shedding some light on the matter.How Linux does I/O
By default, the write() system call returns after all data has been copied from the user space file descriptor into the kernel space buffers. There is no guarantee that data has actually reached the physical storage.
The fsync() call is our friend here. This will block and return only after the data and metadata (e.g. file size, last update time) is completely transferred to the actual physical storage.
There is also fdatasync() which only guarantees the data portion will be transferred, so it should be faster.
There are a few options that we can specify at file open time, that modify the behaviour of write():O_SYNC
In this case, the write() system call will still write data to kernel space buffers, but it will block until the data is actually transferred from the kernel space buffers to the physical storage. There is no need to call fsync() after.O_DIRECT
This completely bypasses any kernel space buffers, but requires that the writes are the same size as the underlying storage block size (usually 512 bytes or 4k). By itself, it does not guarantee that the data is completely transferred to the device when the call returns.O_SYNC + O_DIRECT
As stated above, we would need to use both options together guarantee true synchronous IO.Relation with InnoDB flushing
Innodb_flush_method parameter controls which options will be used by MySQL when writing to files:
At the time of this writing, we have the following options:NULL
This is the default value, and is equivalent to specifying fsync option.fsync
Both data and redo log files will be opened without any special options, and fsync() will be used when the db needs to make sure the data is transferred to the underlying storage.O_DSYNC
This one is confusing, as O_DSYNC us actually replaced with O_SYNC within the source code before calling open(). It is mentioned this is due to some problems on certain Unix versions. So O_SYNC will be used to open the log files, and no special options for the datafiles. This means fsync() needs to be used to flush the data files only.O_DIRECT
Data files are opened with O_DIRECT. Log files are opened with no extra options. Some filesystems (e.g. XFS) do not guarantee metadata without the fsync() call, so it is still used as safety measure.O_DIRECT_NO_FSYNC
InnoDB uses O_DIRECT during flushing I/O, but skips the fsync() system call afterwards. This can provide some performance benefits if you are using a filesystem that does not require the fsync() to sync metadata.
I am deliberately not mentioning the experimental options littlesync and nosync.
There is also an extra option in Percona Server:ALL_O_DIRECT
It uses O_DIRECT to open the log files and data files and uses fsync() to flush both the data and the log files.Which InnoDB flushing method should I use?
The general consensus if you have a battery backed write cache or fast IO subsystem (e.g. SSD’s) is to use the O_DIRECT method. However it is a better practice to run tests to determine which method provides a better performance for each particular environment.
One downside of using O_DIRECT is that it requires the innodb-buffer-pool-size to be configured correctly. For example, if you accidentally leave your buffer pool size at the default value of 128M, but have 16G of RAM, the buffer pool contents will at least sit in the filesystem cache. This will not be true if you have O_DIRECT enabled (I would like to thank Morgan Tocker for his contribution regarding this section of the post).
This Log Buffer Edition takes into account blog posts from Oracle, SQL Server and MySQL.
Enterprise Manager Support Files 101- The EMOMS files
From time to time we see a complaint on OTN about the stats history tables being the largest objects in the SYSAUX tablespace and growing very quickly.
Delphix replication and push button cloud migration
PS360: A Utility to Extract and Present PeopleSoft Configuration and Performance Data
Contemplating Upgrading to OBIEE 12c?
Modifying the SQL Server Model System Database to Customize New Database Settings
Azure SQL Database Elastic Database Jobs
SQL Server Resource Governor
Add a Custom Index in Master Data Services 2016
Unified Approach to Generating Documentation for PowerShell Cmdlets
Writing SQL that works on PostgreSQL, MySQL and SQLite
MariaDB MaxScale 1.4.2 GA is available for download
MariaDB ColumnStore, a new beginning
Planets9s – Watch the replay: Become a MongoDB DBA (if you’re re really a MySQL user)
Upgrading to MySQL 5.7, focusing on temporal types
My defined index was on (OWNER, OBJECT_TYPE, OBJECT_NAME) --- defined as being from the column with the fewest distinct values to the most. This ordering is best compressible with Index Key Compression (also known as Prefix Compression). If I reverse the order, Index Key Compression for the two leading columns wouldn't deliver the same level of compression. The question is whether Advanced Index Compression can intelligently handle the reversal.
SQL> create index target_data_ndx_3_comp on
2 target_data(object_name, object_type, owner) compress 2;
SQL> exec dbms_stats.gather_index_stats('','TARGET_DATA_NDX_3_COMP');
PL/SQL procedure successfully completed.
SQL> select leaf_blocks
2 from user_indexes
3 where index_name = 'TARGET_DATA_NDX_3_COMP'
Surprisingly, this index with Prefix 2 on (OBJECT_NAME, OBJECT_TYPE) is, at 3,091 leaf blocks, smaller than the previous index with Prefix 2 on (OWNER, OBJECT_TYPE) at 5,508 leaf blocks.
Continuing with Prefix 3
SQL> drop index target_data_ndx_3_comp;
SQL> create index target_data_ndx_3_comp on
2 target_data(object_name, object_type, owner) compress 3;
SQL> exec dbms_stats.gather_index_stats('','TARGET_DATA_NDX_3_COMP');
PL/SQL procedure successfully completed.
SQL> select leaf_blocks
2 from user_indexes
3 where index_name = 'TARGET_DATA_NDX_3_COMP'
At 2,277 leaf blocks it is, as expected, the same size with Prefix 3 on (OWNER, OBJECT_TYPE, OBJECT_NAME). Since the entire index key is specified as the Prefix, both indexes would be the same size.
Going on to Advanced Index Compression
SQL> drop index target_data_ndx_3_comp;
SQL> create index target_data_ndx_4_advcomp on
2 target_data(object_name, object_type, owner)
3 compress advanced low
SQL> exec dbms_stats.gather_index_stats('','TARGET_DATA_NDX_4_ADVCOMP');
PL/SQL procedure successfully completed.
SQL> select leaf_blocks
2 from user_indexes
3 where index_name = 'TARGET_DATA_NDX_4_ADVCOMP'
This is, again, as expected. Advanced Index Compression results in the same size irrespective of the ordering of the columns.
The advantage of Advanced Index Compression over Key or Prefix Compression is that the DBA does not need to determine the Prefix for compression. He does not have to spend time to analyze the data and compare the number of distinct values for each of the columns in the composite index.
MySQL uses metadata locking to manage concurrent access to database objects, and to ensure data consistency when performing modifications to the schema: DDL operations. Metadata locking applies not just to tables, but also to schemas and stored programs (procedures, functions, triggers, and scheduled events).
In this post I am going to cover metadata locks on tables and triggers, that are usually seen by DBAs during regular operations/maintenance.
Kindly refer to these 4 different connections to MySQL Instance:
The screenshot shows that the uncommitted transaction may cause metadata lock to ALTER operations. The ALTER will not proceed until the transaction is committed or rolled-back. What is worse, after the ALTER is issued, any queries to that table (even simple SELECT queries) will be blocked. If the ALTER operation is an ONLINE DDL operation available in 5.6+, queries will proceed as soon as the ALTER begins.
Refer to this video tutorial on MySQL Metadata Locks for further context.
These days we have a “DBAs” favourite tool “pt-online-schema-change” (osc). Let’s have a look what will happen If we run osc instead of ALTER.
OSC gets stuck at metadata lock at the point of creating triggers on table.
Let’s jump on the second topic how can we mitigate MDL issues:Mitigating the MetaData Lock Issues
There are various solutions to tackling MDL:
- Appropriate setting of wait_timeout variable which will kill stuck/sleep threads after a certain time.
- Configure pt-kill to get rid of stuck/sleep threads
- Fix code where transactions are not committed after performing DB queries
If you are on RDS and your MySQL is having bunch of Sleep threads and you don’t know which connection is causing metadata lock, then you have to kill all the Sleep queries which are in mysql for more than a certain time. As we know “kill thread_id” is not permitted in RDS, but you can use the query below to get the exact queries to kill Sleep threads.
mysql> SELECT CONCAT('CALL mysql.rds_kil ( ',id,')',';') FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND='Sleep' AND TIME > 10 ; +---------------------------------------------+ | CONCAT('CALL mysql.rds_kill ( ',id,')',';') | +---------------------------------------------+ | CALL mysql.rds_kill ( 5740758); | | CALL mysql.rds_kill ( 5740802); | | CALL mysql.rds_kill ( 5740745); | | CALL mysql.rds_kill ( 5740612); | | CALL mysql.rds_kill ( 5740824); | | CALL mysql.rds_kill ( 5740636); | | CALL mysql.rds_kill ( 5740793); | | CALL mysql.rds_kill ( 5740825); | | CALL mysql.rds_kill ( 5740796); | | CALL mysql.rds_kill ( 5740794); | | CALL mysql.rds_kill ( 5740759); | | CALL mysql.rds_kill ( 5740678); | | CALL mysql.rds_kill ( 5740688); | | CALL mysql.rds_kill ( 5740817); | | CALL mysql.rds_kill ( 5740735); | | CALL mysql.rds_kill ( 5740818); | | CALL mysql.rds_kill ( 5740831); | | CALL mysql.rds_kill ( 5740795); | | CALL mysql.rds_kill ( 4926163); | | CALL mysql.rds_kill ( 5740742); | | CALL mysql.rds_kill ( 5740797); | | CALL mysql.rds_kill ( 5740832); | | CALL mysql.rds_kill ( 5740751); | | CALL mysql.rds_kill ( 5740760); | | CALL mysql.rds_kill ( 5740752); | | CALL mysql.rds_kill ( 5740833); | | CALL mysql.rds_kill ( 5740753); | | CALL mysql.rds_kill ( 5740722); | | CALL mysql.rds_kill ( 5740723); | | CALL mysql.rds_kill ( 5740724); | | CALL mysql.rds_kill ( 5740772); | | CALL mysql.rds_kill ( 5740743); | | CALL mysql.rds_kill ( 5740744); | | CALL mysql.rds_kill ( 5740823); | | CALL mysql.rds_kill ( 5740761); | | CALL mysql.rds_kill ( 5740828); | | CALL mysql.rds_kill ( 5740762); | | CALL mysql.rds_kill ( 5740763); | | CALL mysql.rds_kill ( 5740764); | | CALL mysql.rds_kill ( 5740773); | | CALL mysql.rds_kill ( 5740769); | | CALL mysql.rds_kill ( 5740770); | | CALL mysql.rds_kill ( 5740771); | | CALL mysql.rds_kill ( 5740774); | | CALL mysql.rds_kill ( 5740784); | | CALL mysql.rds_kill ( 5740789); | | CALL mysql.rds_kill ( 5740790); | | CALL mysql.rds_kill ( 5740791); | | CALL mysql.rds_kill ( 5740799); | | CALL mysql.rds_kill ( 5740800); | | CALL mysql.rds_kill ( 5740801); | | CALL mysql.rds_kill ( 5740587); | | CALL mysql.rds_kill ( 5740660); | +---------------------------------------------+ 53 rows in set (0.02 sec)
- Capture sql queries to kill Sleep threads
mysql -htest-server.us-west-2.rds.amazonaws.com. –skip-column-names -e ‘SELECT CONCAT(“CALL mysql.rds_kill ( “,id,”)”,”;”) FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND=”Sleep” AND TIME > 10’ > kill_sleep_threads.sql
2.Execute queries from mysql prompt
mysql -htest-server.us-west-2.rds.amazonaws.com. mysql> source kill_sleep_threads.sqlImprovements in MySQL 5.7 related to MDL
Generally, we would want to kill as few connections as possible. But the trouble with metadata locks prior to 5.7 is that there is no insight available into which threads are taking the metadata lock. In MySQL 5.7, there are several improvements in getting insight into metadata lock information.
“The Performance Schema now exposes metadata lock information:
- Locks that have been granted (shows which sessions own which current metadata locks)
- Locks that have been requested but not yet granted (shows which sessions are waiting for which metadata locks).
- Lock requests that have been killed by the deadlock detector or timed out and are waiting for the requesting session’s lock request to be discarded
This information enables you to understand metadata lock dependencies between sessions. You can see not only which lock a session is waiting for, but which session currently holds that lock.
The Performance Schema now also exposes table lock information that shows which table handles the server has open, how they are locked, and by which sessions.”
To check who holds the metadata lock in MySQL 5.7, We have to enable global_instrumentation and wait/lock/metadata/sql/mdl.
Below is the example to enable global_instrumentation and wait/lock/metadata/sql/mdl
mysql> UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME = 'global_instrumentation'; Query OK, 0 rows affected (0.00 sec) Rows matched: 1 Changed: 0 Warnings: 0 mysql> UPDATE performance_schema.setup_instruments SET ENABLED = 'YES' WHERE NAME = 'wait/lock/metadata/sql/mdl'; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0
Once global_instrumentation and wait/lock/metadata/sql/mdl are enable, below query will show the locks status on connections.
mysql> SELECT OBJECT_TYPE, OBJECT_SCHEMA, OBJECT_NAME, LOCK_TYPE, LOCK_STATUS, THREAD_ID, PROCESSLIST_ID, PROCESSLIST_INFO FROM performance_schema.metadata_locks INNER JOIN performance_schema.threads ON THREAD_ID = OWNER_THREAD_ID WHERE PROCESSLIST_ID <> CONNECTION_ID(); +-------------+---------------+-------------+---------------------+-------------+-----------+----------------+------------------------------------------+ | OBJECT_TYPE | OBJECT_SCHEMA | OBJECT_NAME | LOCK_TYPE | LOCK_STATUS | THREAD_ID | PROCESSLIST_ID | PROCESSLIST_INFO | +-------------+---------------+-------------+---------------------+-------------+-----------+----------------+------------------------------------------+ | TABLE | sbtest | sbtest1 | SHARED_READ | GRANTED | 29 | 4 | NULL | | GLOBAL | NULL | NULL | INTENTION_EXCLUSIVE | GRANTED | 30 | 5 | alter table sbtest1 add key idx_pad(pad) | | SCHEMA | sbtest | NULL | INTENTION_EXCLUSIVE | GRANTED | 30 | 5 | alter table sbtest1 add key idx_pad(pad) | | TABLE | sbtest | sbtest1 | SHARED_UPGRADABLE | GRANTED | 30 | 5 | alter table sbtest1 add key idx_pad(pad) | | TABLE | sbtest | sbtest1 | EXCLUSIVE | PENDING | 30 | 5 | alter table sbtest1 add key idx_pad(pad) | | TABLE | sbtest | sbtest1 | SHARED_READ | PENDING | 31 | 6 | select count(*) from sbtest1 | +-------------+---------------+-------------+---------------------+-------------+-----------+----------------+------------------------------------------+ 6 rows in set (0.00 sec)
Here PROCESSLIST_ID 4 is GRANTED and other PROCESSLIST_IDs are in PENDING state.Conclusion
Best-practice when running any DDL operation, even with performance schema changes in 5.7, it to make sure to check processlist for presence of MDL waits, check SHOW ENGINE INNODB STATUS for long active transactions. Kill DDL operation while resolving the MDL issue so as to prevent query pileup. For a temporary fix implement pt-kill or wait_timeout. Review and fix application code/scripts for any uncommitted transactions to solve metadata lock issue.
- 10 MB storage - free
- 25 MB storage - free
- 5 GB - $175/month
- 20 GB - $900/month
- 50 GB - $2000/month
We covered this topic from a different angle a month ago in convertign excel to apex and printing from apex. Both of these blog entries talk about how to use Schema as a Service to solve a problem. Some good references on Schema as a Service can be found atsafari books subscription to get these books on demand and on my iPad for reading on an airplane.
When we login to access the database we are asked to present the schema that we created, a username, and a password. Note in this example we are either using the external free service apex.oracle.com or the Oracle corporate service for employees apex.oraclecorp.com. The two services are exactly the same. As an Oracle employee I do not have access to the public free service and am encouraged to use the internal service for employees. The user interface is the same but screen shots will bounce between the two as we document how to do things.
Once we login we see a main menu system that allows us to manage application, manage tables in the database, to team development, and download and install customer applications.
The Object Browser allows us to look at the data in the database. The SQL Commands allow us to make queries into the database. The SQL Scripts allows us to load and save sql commands to run against the database. Utilities allows us to load and unload data. The REST ful service allows us to define html interfaces into the database. If we look at the Object Browser, we can look at table definitions and data stored in tables.
We can use the SQL Commands tab to execute select statements against a table. For example, if we want to look at part number B77077 we can select it from the pricelist by matching the column part_number. We should get back one entry since there is only one part for this part number.
If we search for part number B77473 we get back multiple entries that are the same part number. This search returns six lines of data with more data in other columns than the previous select statement.
The SQL Scripts allows you to load scripts to execute from your laptop or desktop. You can take queries that have run against other servers and run them against this server.
Up to this point we have looked at how to write queries, run queries, and execute queries. We need to look at how to load data so that we have something to query against. This is done in the Utilities section of the Schema as a Service. Typically we start with a data source either as an XML source or an Excel spreadsheet. We will look first at taking an Excel spreadsheet like the one below and importing it into a table.
Note that the spreadsheet is well defined and headers exist for the data. We do have some comments at the top that we need to delete so that the first row becomes the column names in our new table. Step one is to save the sheet as a comma separated value file. Step two is to edit this file and delete the comment and blank like. Step three is to upload the file into the Schema as a Service web site.
At this point we have a data source loaded and ready to drop into a table. The user interface allows us to define the column type and tries to figure out if everything is character strings, numbers, or dates. The tools is good at the import but typically fails at character length and throws exceptions on specific rows. If this happens you can either manually enter the data or re-import the data into an existing table. Doing this can potentially cause replication of data so deleting the new table and re-importing into a new table might be a good thing. You have to play at this point to import your data and get the right column definitions to import all of your data.
Alternatively we can import xml data into an existing table. This is the same process of how we backup our data by exporting it as xml.
At this point we have loaded data using a spreadsheet or csv file and an xml file. We can query the database by entering sql commands or loading sql scripts. We could load data from a sql script if we wanted but larger amounts of data needs to be imported with a file. Unfortunately, we can not take a table from an on-premise database or an rman backup and restore into this database. We can't unplug a pdb and plug it into this instance. This service does have limitations but for free or less than $200 per month, the service provides a lot of functionality.
To create an application to read and display our data, we must create an application. To do this we go into the application development interface of Schema as a Service.
We select the Application Builder tab at the top and click on Create icon. This takes us to a selection of what type of application to build. We are going to build a desktop application since it has the most function options. We could have easily selected the mobile option which formats displays for a smaller screen format.
We have to select a name for our application. In this example we are calling it Sample_DB since we are just going to query our database and display contents of a table.
We are going to select one page. Note that we can create multiple pages to display different information or views into a table. In our previous blog entry on translating excel to apex we created a page to display the cost of archive and the cost of storage on different pages. In this example we are going to create one page and one page only.
If we have shared components from other apps or want to create libraries that can called from other apps, we have the option at this point to define that. We are not going to do this but just create a basic application to query a database.
We can create a variety of authorization sources to protect our data. In this example we are going to allow anyone to read and write our table. We select no authorization. We could use an external source or look at the user table in the database to authenticate users. For this example, we will leave everything open for our application.
We get a final confirmation screen (not shown) and create the application. When the create is finished we see the following screen that lets us either run the application or edit it.
If we click on the Home icon we can edit the page. This screen is a little daunting. There are to many choices and things that you can do. There are different areas, breadcrumbs for page to page navigation, and items that you can drop into a screen. In this example we are going to add a region by hovering the mouse over the Content Body and right clicking the mouse. This allows us to create a new region in the body of our page.
Note that a new region is created and is highlighted in the editor. We are going to edit the content type. We have a wide variety of options. We could type in static text and this basically becomes a static web page. Note that we could create a graph or chart. We could create a classic report. We could create a form to submit data and query a table. We will use the interactive report because it allows us to enter sql for a query into our table.
In this example we will enter the select statement in the query box. We could pop this box into another window for full screen editing. For our example we are doing a simple select * into our table with select * from pricelist.
When we click the run button at the top right we execute this code and display it in a new window. We can sort this data, we can change the query and click Go. This is an interactive from to read data from our table. If we wanted to restrict the user from reading all of the data we would have selected a standard report rather than an interactive report.
The final part of our tutorial is creation of a REST api for our data. We would like to be able to go to a web page and display the data in the table. For example, if we want to look at the description of part number B77077 it would be nice to do it from a web page or get command at the command line. To do this we go to the SQL Workshop tab and click the RESTful Service icon at the top right.
Again, this screen is a little daunting. We get a blank screen with a create button. Clicking the create button takes to a screen where we need to enter information that might not be very familiar.
The screen we see is asking us to enter a name for our service, a template, and a resource handler. Looking at this for the first time, I am clueless as to what this means. Fortunately, there is an example on how to enter this information if you scroll down and click on the Example button.
If we look at the example we see that the service name is the header that we will hit from our web page.
In our example we are going to create a cloud RESTapi where we expose the pricelist. In this example we call the service cloud. We call the resource template pricelist and allow the user to pass in a part_number to query. In the resource handler we go a get function that does a select from the table. We could pass in the part number that we want to read but for simplicity we ignore the part number and return all rows in the table. Once we click save, we have exposed our table to a web query with no authentication.
Once we have created our REST service we can query the database from a we browser using the url of the apex server/pls/apex/(schema name)/pricelist/(part number). In this example we go to apex.oraclecorp.com/pls/apex/parterncloud/pricelist/B77077. It executes the select statement and returns all rows in the table using JSON format.
In summary, we are able to upload, query, and display datatbase data using http and https protocols. We can upload data in xml or csv format. We can query the database using web based tools or REST interfaces. We can display data either by developing a web based program to display data or pull the data from a REST interface and get the data in JSON format. This service is free if your database size is small. If we have a larger database we can pay for the service as well as host the application to read the data. We have the option to read the data from a REST interface and pull it into an application server at a different location. We did not look at uploading data with a PUT interface through the REST service but we could have done this as well. Up next, how do we implement this same service in AWS or Azure.