
Alfresco: some useful database queries

Yann Neuhaus - Sun, 2015-05-10 04:50


In my previous post, I talked about the Lifecycle of Alfresco Nodes. You may have noticed that I tried to insert into my explanations some elements that are specific to databases (tables, fields, and so on). These elements are quite essential to prepare a post like this one, which is more database oriented. I already explained what exactly the consequences are on the database side when a node is removed, and in this post I will try to share some useful queries regarding these points - but not only!


For this post, I used my local Alfresco Community 4.2.c installation with a PostgreSQL database. For your information, it takes just 30 minutes to get this test environment ready with the Alfresco installer (Windows, Mac or Unix). Of course, using the database directly for your daily administration work is certainly not the best idea, but in some cases it can really be faster and easier to just run some SQL commands at the DB level...


I. Document information


So let's start this post with some generic queries that can be used to retrieve information about documents. In this part, the columns of the results will always be the same because I pick the same fields in all my queries, but the filter part (the WHERE clause) changes a little bit to retrieve the information from different elements.

The first command I would like to show you is how to retrieve information about documents based on the size of their content. Here, I just uploaded the document "Test_Lifecycle.docx" with a size of 52MB. So based on that, let's say that I want to retrieve all elements in my Alfresco installation with a content bigger than 40MB. Using the same approach, you can select all elements with a content smaller than, or between, XX and YY MB. The conversion to MB is done using the round() function. Therefore, if you want this value in KB instead, just remove one division by 1024 in each round() function:

All documents bigger than 40MB
SELECT n.id AS "Node ID",
       n.store_id AS "Store ID",
       round(u.content_size/1024/1024,2) AS "Size (MB)",
       n.uuid AS "Document ID (UUID)",
       n.audit_creator AS "Creator",
       n.audit_created AS "Creation Date",
       n.audit_modifier AS "Modifier",
       n.audit_modified AS "Modification Date",
       p1.string_value AS "Document Name",
       u.content_url AS "Location"
FROM alf_node AS n,
     alf_node_properties AS p,
     alf_node_properties AS p1,
     alf_namespace AS ns,
     alf_qname AS q,
     alf_content_data AS d,
     alf_content_url AS u
WHERE n.id=p.node_id
  AND ns.id=q.ns_id
  AND p.qname_id=q.id
  AND p.long_value=d.id
  AND d.content_url_id=u.id
  AND p1.node_id=n.id
  AND p1.qname_id IN (SELECT id FROM alf_qname WHERE local_name='name')
  AND round(u.content_size/1024/1024,2)>40
ORDER BY u.content_size DESC;



I will only show it once, but here is the result of this command in this case:

 Node ID | Store ID | Size (MB) |          Document ID (UUID)          | Creator |         Creation Date         | Modifier |       Modification Date       |    Document Name    |                            Location                            
---------+----------+-----------+--------------------------------------+---------+-------------------------------+----------+-------------------------------+---------------------+----------------------------------------------------------------
131856 | 6 | 52.00 | eb267742-c018-4ba5-8ca4-75ca23c860f0 | Morgan | 2015-04-30T12:05:50.613+02:00 | Morgan | 2015-04-30T12:05:50.613+02:00 | Test_Lifecycle.docx | store://2015/4/30/12/5/0e111f05-7fcf-4a44-b719-b94cd04dd5ab.bin


So why did I select these fields?!

  • Node ID: can be useful to join different tables
  • Store ID: a Store ID of 6 means that your document is in its active life. A Store ID of 5 means that this document has been deleted by a user and is now in the global trashcan
  • Size (MB): what we are searching for...
  • Document ID (UUID): the unique identifier of this document. The simplest way to preview this document is just to open the following URL in any browser: http://HOSTNAME:PORT/share/page/document-details?nodeRef=workspace://SpacesStore/eb267742-c018-4ba5-8ca4-75ca23c860f0 (workspace://SpacesStore for store_id=6)
  • Creator, Modifier, Dates: well...
  • Document Name: can be useful to know the type of the document (via the file extension) without opening a URL
  • Location: the actual location of the content's file on the File System. The "store://" refers to $ALF_DATA/contentstore/


The second command I would like to show you is how to retrieve some information based on the actual UUID of a document. As explained above, the UUID of a document can be found in the URL of its details page:

A document using its UUID
SELECT n.id AS "Node ID",
       n.store_id AS "Store ID",
       round(u.content_size/1024/1024,2) AS "Size (MB)",
       n.uuid AS "Document ID (UUID)",
       n.audit_creator AS "Creator",
       n.audit_created AS "Creation Date",
       n.audit_modifier AS "Modifier",
       n.audit_modified AS "Modification Date",
       p1.string_value AS "Document Name",
       u.content_url AS "Location"
FROM alf_node AS n,
     alf_node_properties AS p,
     alf_node_properties AS p1,
     alf_namespace AS ns,
     alf_qname AS q,
     alf_content_data AS d,
     alf_content_url AS u
WHERE n.id=p.node_id
  AND ns.id=q.ns_id
  AND p.qname_id=q.id
  AND p.long_value=d.id
  AND d.content_url_id=u.id
  AND p1.node_id=n.id
  AND p1.qname_id IN (SELECT id FROM alf_qname WHERE local_name='name')
  AND n.uuid='eb267742-c018-4ba5-8ca4-75ca23c860f0';



Another possible command would be to find some information based on the File System location. That can be useful, for example, if there is a big document on the File System and you want to know its type (via the extension), its name or maybe some other information about the creator/modifier:

A document using its path on the File System
SELECT n.id AS "Node ID",
       n.store_id AS "Store ID",
       round(u.content_size/1024/1024,2) AS "Size (MB)",
       n.uuid AS "Document ID (UUID)",
       n.audit_creator AS "Creator",
       n.audit_created AS "Creation Date",
       n.audit_modifier AS "Modifier",
       n.audit_modified AS "Modification Date",
       p1.string_value AS "Document Name",
       u.content_url AS "Location"
FROM alf_node AS n,
     alf_node_properties AS p,
     alf_node_properties AS p1,
     alf_namespace AS ns,
     alf_qname AS q,
     alf_content_data AS d,
     alf_content_url AS u
WHERE n.id=p.node_id
  AND ns.id=q.ns_id
  AND p.qname_id=q.id
  AND p.long_value=d.id
  AND d.content_url_id=u.id
  AND p1.node_id=n.id
  AND p1.qname_id IN (SELECT id FROM alf_qname WHERE local_name='name')
  AND u.content_url='store://2015/4/30/12/5/0e111f05-7fcf-4a44-b719-b94cd04dd5ab.bin';



II. Number of...


From a reporting point of view, let's say that you need some information regarding the number of... something. In this case, and if you want to use your DB directly, then there is a really simple solution (simple, but is it the best?) because Alfresco provides a database architecture that is quite simple to understand and to use to get what you need. Indeed, if you take a look at the "alf_qname" table, you will see that every element that is part of Alfresco has its QName listed here. A QName is the Qualified Name of a repository item. This can be seen as a kind of "Super-Type":

alfresco=> SELECT * FROM alf_qname;
 id | version | ns_id | local_name
----+---------+-------+------------
  1 |       0 |     1 | store_root
  2 |       0 |     1 | aspect_root
  3 |       0 |     1 | container
  4 |       0 |     1 | children
  5 |       0 |     2 | user
...
 24 |       0 |     6 | folder
...
 51 |       0 |     6 | content
...
133 |       0 |     6 | thumbnail
134 |       0 |    13 | rendition
...


As you can see above, if you are searching for something that has a content, it can be done quite easily using the id or the local_name that corresponds to it. So based on this table, here are some queries that can be useful:

Retrieve the number of users in the Repository
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q
WHERE n.type_qname_id=q.id
  AND q.local_name='user';

 

Retrieve the number of elements with a content in the Repository (include system's documents)
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q
WHERE n.type_qname_id=q.id
  AND q.local_name='content';

 

Retrieve the number of thumbnails in the Repository
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q
WHERE n.type_qname_id=q.id
  AND q.local_name='thumbnail';

 

Retrieve the number of renditions in the Repository
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q
WHERE n.type_qname_id=q.id
  AND q.local_name='rendition';



Of course you can do that for all QNames, but you can also be more precise! So, based on the query that retrieves the number of elements with a content, if you only want the number of documents of a specific type, then you can simply complete your query:

Retrieve the number of XML documents in the Repository
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q,
     alf_node_properties AS p
WHERE n.type_qname_id=q.id
  AND p.node_id=n.id
  AND p.qname_id IN
    (SELECT id
     FROM alf_qname
     WHERE local_name='name')
  AND q.local_name='content'
  AND p.string_value LIKE '%.xml';

 

Retrieve the number of PDF documents in the Repository
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q,
     alf_node_properties AS p
WHERE n.type_qname_id=q.id
  AND p.node_id=n.id
  AND p.qname_id IN
    (SELECT id
     FROM alf_qname
     WHERE local_name='name')
  AND q.local_name='content'
  AND p.string_value LIKE '%.pdf';


As the creation date, creator, modification date and modifier information are also stored in the "alf_node" table, you can also very easily filter your queries based on the creation/update date of an Alfresco Node. That's pretty cool, right?! ;)
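
For example, here is a minimal sketch of such a date filter (assuming, as the result shown earlier suggests, that audit_created is stored as an ISO-8601 string, so a simple string comparison works):

Number of documents created in April 2015
SELECT count(*)
FROM alf_node AS n,
     alf_qname AS q
WHERE n.type_qname_id=q.id
  AND q.local_name='content'
  AND n.audit_created >= '2015-04-01'
  AND n.audit_created < '2015-05-01';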


III. Lifecycle specific


To complete the relation between this blog post and the previous one, I wanted to share some queries that can be used to identify the current state of a document. As explained in my previous post, a document that is not yet deleted will be in the store named "workspace://SpacesStore". A document that has been deleted by a user will be in the store named "archive://SpacesStore", and when this document is removed from the global trashcan, the orphan_time is set to the current timestamp. With all this information and with the "alf_node" and "alf_content_url" tables, we can easily build our own queries to find what is needed.

alfresco=> SELECT * FROM alf_store;
 id | version | protocol  |       identifier        | root_node_id
----+---------+-----------+-------------------------+--------------
  1 |       1 | user      | alfrescoUserStore       |            1
  2 |       1 | system    | system                  |            5
  3 |       1 | workspace | lightWeightVersionStore |            9
  4 |       1 | workspace | version2Store           |           10
  5 |       1 | archive   | SpacesStore             |           11
  6 |       1 | workspace | SpacesStore             |           12
(6 rows)


So let's find all documents that have been created and aren't deleted yet:

All documents created in their active life
SELECT *
FROM alf_node
WHERE store_id=6
  AND type_qname_id=51;
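
Note that the QName id 51 ("content") used here is the value from my installation and it may differ elsewhere. Here is a sketch of the same query that avoids hard-coding it (the same substitution works for the next queries as well):

All documents in their active life, without hard-coding the QName id
SELECT n.*
FROM alf_node AS n,
     alf_qname AS q
WHERE n.type_qname_id=q.id
  AND q.local_name='content'
  AND n.store_id=6;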


The next step in the lifecycle is when the documents have been deleted by a user but haven't been removed from the global trashcan yet (orphan_time is still NULL):

All documents created that are in the global trashcan (deleted by users)
SELECT *
FROM alf_node
WHERE store_id=5
  AND type_qname_id=51;


Finally, when the documents are removed from the global trashcan, some references/fields are removed, the QName of these documents changes from "content" (51) to "deleted" (140) in the "alf_node" table and the orphan_time is set to the current timestamp in the "alf_content_url" table:

All elements that have been removed from the global trashcan and that are now orphaned
SELECT *
FROM alf_content_url
WHERE orphan_time IS NOT NULL;
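
If you also want to know when these contents became orphaned and how big they are, here is a minimal sketch (assuming that orphan_time is stored in milliseconds since the epoch, which is what the values I observed suggest):

Orphaned contents with a readable orphan date and their size
SELECT u.content_url,
       round(u.content_size/1024/1024,2) AS "Size (MB)",
       to_timestamp(u.orphan_time/1000) AS "Orphan Date"
FROM alf_content_url AS u
WHERE u.orphan_time IS NOT NULL
ORDER BY u.orphan_time;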



I hope you enjoyed this blog post because it was quite hard for me to write something about database queries without giving up my soul to the DB world! See you soon ;).


Variations on 1M rows insert (1): bulk insert

Yann Neuhaus - Fri, 2015-05-08 09:19

I think you have already read the interesting series of blog posts from my colleagues Franck and Daniel about inserting 1 million rows in Oracle and PostgreSQL. So it's time to write the first post of the same series concerning SQL Server. First of all, just to clarify, the idea is not to make a direct comparison between Oracle, PostgreSQL and SQL Server but just to see the variations that exist for each of them to insert 1 million rows quickly.

So I will perform the same basic test as my colleagues, with roughly the same environment: one virtual machine on Hyper-V with only one processor, 512MB of memory and one SQL Server 2014 Enterprise Edition instance capped to 512 MB of memory:

 

Get-WmiObject -Class Win32_processor | ft Manufacturer, Name, NumberOfCores, NumberOfLogicalProcessors -AutoSize

 

[Screenshot: CPU configuration]

 

SELECT @@VERSION

 

[Screenshot: SQL Server version]

 

SELECT name, value_in_use
FROM sys.configurations
WHERE name = 'max server memory (MB)'

 

[Screenshot: max server memory configuration]

 

Row-by-row method

 

Let's start by using the same test tables - one heap table and one clustered table - and the same kind of script as well. I just modified the original script written by Franck; translating PL/SQL into T-SQL often implies using a completely different syntax, but anyway we will produce roughly the same bunch of data.

My user objects are stored in a user database called DEMO for this first test:

 

if object_id('DEMO', 'U') is not null
    drop table DEMO;

create table DEMO("id" int, "text" varchar(15), "number" int);

if object_id('DEMO_PK', 'U') is not null
    drop table DEMO_PK;

create table DEMO_PK("id" int, "text" varchar(15), "number" int,
    constraint demo_pk_pk primary key (id));

...

DECLARE @i INT = 1;

WHILE @i <= 1000000
BEGIN
    INSERT INTO DEMO VALUES (@i, CASE ROUND(RAND() * 10, 0)
                                     WHEN 1 THEN 'Marc'
                                     WHEN 2 THEN 'Bill'
                                     WHEN 3 THEN 'George'
                                     WHEN 4 THEN 'Eliot'
                                     WHEN 5 THEN 'Matt'
                                     WHEN 6 THEN 'Trey'
                                     ELSE 'Tracy'
                                 END, RAND() * 10000);
    SET @i += 1;
END

 

Here my first result for both tables:

-- 00:02:29 – Heap table

-- 00:02:25 – table with clustered index

 

There are no big differences between inserting data into a heap table and into a clustered table in this scenario because we insert rows basically in the same manner (always in the last page). At this point it is important to keep in mind that by default SQL Server uses autocommit mode. It means that each insert statement represents a transaction which has to be committed to the transaction log.

If we take a look at the specific wait statistics, we can expect that most of the waits will concern the log write activity.
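
For reference, here is a minimal sketch of how such waits can be checked (sys.dm_os_wait_stats is the standard DMV; the short list of wait types to filter on is just an assumption for this scenario):

SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN ('WRITELOG', 'PAGEIOLATCH_EX', 'SOS_SCHEDULER_YIELD')
ORDER BY wait_time_ms DESC;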

 

[Screenshot: wait statistics]

 

That's it! The average values are pretty low, but our results are far away from those of my colleagues. Let's stay motivated and talk about a kind of workaround to speed up the insert query. In fact, putting the user objects in the tempdb database might be a kind of workaround. The main drawback is that the tempdb database is temporary by design. Thus, our user objects will only persist until the next restart of SQL Server, but let's perform the previous test on tempdb anyway.

Here are the results I got from this test:

-- 00:00:16 – Heap table

-- 00:00:15 – table with clustered index


 

So, a big improvement here.  Furthermore we may notice that related wait statistics have also changed as follows:

 

[Screenshot: wait statistics]


 

This main wait type is just related to sustained CPU usage ... so our final result is not so bad. At this point we may wonder why putting user objects in the tempdb database speeds up the overall process. In fact, we benefit from the special logging mechanism used by the tempdb database, which includes the lazy commit feature and the non-logged after image feature for insert and update statements.


Let's go back to the user database DEMO and finish this row-by-row section by inserting data in a single transaction (explicit transaction mode), and let's take a look at the following results:

 

begin transaction

declare @i int = 1;

while @i <= 1000000
begin
    insert DEMO values (@i, case cast(rand() * 10 as tinyint)
                                when 1 then 'Marc'
                                when 2 then 'Bill'
                                when 3 then 'George'
                                when 4 then 'Eliot'
                                when 5 then 'Matt'
                                when 6 then 'Trey'
                                when 7 then 'Tracy'
                                when 8 then 'Greg'
                                when 9 then 'Steve'
                                else 'Patricia'
                            end, rand() * 1000)
    set @i = @i + 1;
end

commit transaction

 

-- 00:00:10 – Heap table

-- 00:00:09 – table with clustered index

 

As expected, we notice a drastic drop in the duration of both tests.

 

Bulk-insert method

Row-by-row commits are not the strength of SQL Server because each commit requires flushing data to the transaction log. So let's switch to bulk insert mode now. There are several ways to bulk insert data with SQL Server (either from the client or the server side, by using for instance the bcp or SSIS tools, the BULK INSERT, SELECT INTO or OPENROWSET commands and so on). In this test, I will use bcp to export data to a file before importing this file into my two tables.

To export my data I used the bcp command as follows:

 

[Screenshot: bcp export command and output]

 

...

 

[Screenshot: bcp export command and output]

 

I used native data types (-n option) in this case because my test only concerns transferring data from and to SQL Server. This option can improve performance, but to be honest, with this amount of data the difference is not relevant.
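
For reference, here is a minimal sketch of such an export (the instance name, the trusted connection and the output path are assumptions; -n requests the native format):

bcp demo.dbo.DEMO out C:\bcp\DEMO.dat -n -T -S localhost
bcp demo.dbo.DEMO_PK out C:\bcp\DEMO_PK.dat -n -T -S localhost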

Let's bulk import our data into the two tables DEMO and DEMO_PK in my user database (not tempdb this time). At this point I want to be more efficient, so I will use minimal logging for the bulk-import operations in order to reduce the possibility of filling the log space and the potential contention (as a reminder, writing to the transaction log file is always synchronous by default). Moreover, don't forget that in this mode writing to the data file switches from asynchronous to synchronous mode. So be careful about your storage performance to avoid facing some unexpected behaviours during your import process.

So for my tests, the database DEMO is configured with the SIMPLE recovery model and I will use the BULK INSERT command with the TABLOCK option (which is a requirement for minimal logging). Other options are possible, but after some testing they did not appear helpful in this context.

Concerning my heap table:

 

bulk insert demo.dbo.DEMO from 'C:\bcp\DEMO.dat'
with (
       DATAFILETYPE = 'native',
       TABLOCK
)

 

SQL Server Execution Times:

   CPU time = 703 ms, elapsed time = 725 ms.

 

Concerning my clustered table:

 

bulk insert demo.dbo.DEMO_PK from 'C:\bcp\DEMO_PK.dat'
with (
       DATAFILETYPE = 'native',
       ORDER (id ASC),
       TABLOCK
)

 

SQL Server Execution Times:

   CPU time = 1437 ms, elapsed time = 1489 ms.

 

A slightly higher execution time than the bulk import into the heap table: the table with a clustered index seems to introduce some overhead.

 

The bottom line is pretty much the same as my colleagues'. Inserting and committing data row by row is not an optimized way if you plan to import a lot of data. So let's continue in the same way as my colleagues with the next posts of this series, until the famous In-Memory feature. Don't forget that if you are in Switzerland in June, our experts in Oracle, Microsoft and SAP technologies will talk about In-Memory as implemented by SQL Server, Oracle and SAP HANA. All on the same day. It's free and you can register now: Event In-Memory: boost your IT performance

Rebuild index on increasing values after deletes?

Yann Neuhaus - Fri, 2015-05-08 00:39

Yesterday, while giving our Oracle tuning workshop, I discussed with the customer a job they have that rebuilds indexes every Sunday. Except in very rare cases, Oracle indexes are maintained so that free space is reused by further inserts. But an index is an ordered structure. When we insert from a sequence, the value is always increasing and goes to the end of the index. And when we delete old data, we delete index entries at the beginning of the index.
Is this a case where we need to manage it ourselves?

Test case

As usual I reproduce the issue. Here is my DEMO table with a DEMOPK index on the primary key:

SQL> create table DEMO (id number constraint DEMOPK primary key);
Table created.

I insert 10000 rows:

SQL> begin
  2   for i in 1..1e4 loop
  3    insert into DEMO values(i);
  4    commit;
  5   end loop;
  6  end;
  7  /
PL/SQL procedure successfully completed.

gather and check the stats:

SQL> exec dbms_stats.gather_table_stats(user,'DEMO');

PL/SQL procedure successfully completed.

SQL> select blocks,blevel,leaf_blocks from user_indexes join user_tables using(table_name) where index_name='DEMOPK';

    BLOCKS     BLEVEL LEAF_BLOCKS
---------- ---------- -----------
        20          1          18

So I have 1 branch and 18 leaf blocks.

 

Fragmentation

I'll check fragmentation from a user point of view. Having too much free space in leaf blocks is a problem for index range scans only. So let's do an index range scan from the beginning to the end of the index:

SQL> alter session set statistics_level=all;
Session altered.

SQL> select /*+ index(DEMO) */ count(*) from DEMO;

  COUNT(*)
----------
     10000

SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  7b6qc9m1cw3zd, child number 0
-------------------------------------
select /*+ index(DEMO) */ count(*) from DEMO

Plan hash value: 3019291478

-------------------------------------------------------------------------------------
| Id  | Operation        | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |      1 |        |      1 |00:00:00.01 |      19 |
|   1 |  SORT AGGREGATE  |        |      1 |      1 |      1 |00:00:00.01 |      19 |
|   2 |   INDEX FULL SCAN| DEMOPK |      1 |  10000 |  10000 |00:00:00.01 |      19 |
-------------------------------------------------------------------------------------

Exactly what we expected: 19 block reads is 1 branch and 18 leaves.

 

I have a script that does the same - range scan on an index - and shows how many index entries we have in each block. The script is here: How to measure Oracle index fragmentation. Let's run it on my index, with a bucket size large enough to see all blocks:

SQL> @ index_fragmentation.sql

        ID ->         ID rows/block bytes/block %free space     blocks free
---------- -- ---------- ---------- ----------- ----------- ---------- -----
         1 ->          1        578        8566                      1
       579 ->        579        571        8559                      1
      1150 ->       1150        571        8559                      1
      1721 ->       1721        571        8560                      1
      2292 ->       2292        571        8559                      1
      2863 ->       2863        571        8559                      1
      3434 ->       3434        571        8559                      1
      4005 ->       4005        571        8560                      1
      4576 ->       4576        571        8559                      1
      5147 ->       5147        571        8559                      1
      5718 ->       5718        571        8560                      1
      6289 ->       6289        571        8559                      1
      6860 ->       6860        571        8559                      1
      7431 ->       7431        571        8559                      1
      8002 ->       8002        571        8560                      1
      8573 ->       8573        571        8559                      1
      9144 ->       9144        571        8559                      1
      9715 ->       9715        286        4287          47          1 oo

Here are our 18 leaf blocks, covering values from 1 to 10000 (the ID displayed is the first one in each bucket - blocks here). The blocks are full (the size is an approximation, which is why it's a bit higher than 8k), with about 570 entries per block. This is expected because when we insert increasing values, the block split fills the block instead of doing a 50-50 split.

 

delete insert lifecycle

Here is what I want to reproduce: delete old rows at the beginning of the index and insert new rows at the end. I'll do that for the same number of rows, 10000, so I'm sure I've deleted rows from all those 18 leaf blocks.

SQL> begin
  2   for i in 1..1e4 loop
  3    delete from DEMO where id=i;
  4    commit;
  5    insert into DEMO values(i+1e4);
  6    commit;
  7   end loop;
  8  end;
  9  /

PL/SQL procedure successfully completed.

Then run my index range scan:

SQL> select /*+ index(DEMO) */ count(*) from DEMO;

  COUNT(*)
----------
     10000

SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  7b6qc9m1cw3zd, child number 0
-------------------------------------
select /*+ index(DEMO) */ count(*) from DEMO

Plan hash value: 3019291478

-------------------------------------------------------------------------------------
| Id  | Operation        | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |        |      1 |        |      1 |00:00:00.01 |      24 |
|   1 |  SORT AGGREGATE  |        |      1 |      1 |      1 |00:00:00.01 |      24 |
|   2 |   INDEX FULL SCAN| DEMOPK |      1 |  10000 |  10000 |00:00:00.01 |      24 |
-------------------------------------------------------------------------------------

Did I double the number of blocks to read? No.

 

Do you think that we lost some space because we read 24 blocks instead of 19? Look at the numbers. The new values are above 10000 and are larger than the initial ones: 4 bytes vs. 3 bytes.
Don't believe me?

SQL> select min(rownum),max(rownum),sum(vsize(rownum)) from (select * from dual connect by 1000>=level),(select * from dual connect by 20>=level) group by ceil(rownum/10000);

MIN(ROWNUM) MAX(ROWNUM) SUM(VSIZE(ROWNUM))
----------- ----------- ------------------
          1       10000              29801
      10001       20000              39899

Yes... No place for guesses and myths... Everything can be measured... Do you know how many blocks we need when the data in 18 blocks is increased by that ratio? 18 * (4/3) = 24, so we are not doing badly at all.

 

The fact is that the 18 leaf blocks have only grown to 20 leaf blocks:

SQL> @ index_fragmentation.sql

        ID ->         ID rows/block bytes/block %free space     blocks free
---------- -- ---------- ---------- ----------- ----------- ---------- -----
     10001 ->      10001        266        4254          47          1 oo
     10267 ->      10267        533        8523                      1
     10800 ->      10800        533        8522                      1
     11333 ->      11333        533        8523                      1
     11866 ->      11866        533        8523                      1
     12399 ->      12399        533        8522                      1
     12932 ->      12932        533        8523                      1
     13465 ->      13465        533        8523                      1
     13998 ->      13998        533        8522                      1
     14531 ->      14531        533        8523                      1
     15064 ->      15064        533        8523                      1
     15597 ->      15597        533        8522                      1
     16130 ->      16130        533        8523                      1
     16663 ->      16663        533        8523                      1
     17196 ->      17196        533        8522                      1
     17729 ->      17729        533        8523                      1
     18262 ->      18262        533        8523                      1
     18795 ->      18795        533        8522                      1
     19328 ->      19328        533        8523                      1
     19861 ->      19861        140        2237          72          1 ooo

and they are all full - except the first and the last one.

 

This is optimal. Do the same test case and you will see that if you coalesce or shrink the index then the number of blocks will not change. More info about it in a previous blog post: index coalesce vs. shrink vs rebuild
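
For reference, the coalesce and shrink variants mentioned above are simple one-liners on this index (a sketch, to be run with the appropriate privileges):

SQL> alter index DEMOPK coalesce;
SQL> alter index DEMOPK shrink space;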

Is it new?

Ok, I've run my tests on 12c and you want to know if it's something new. No it's not new.
Oracle 7.3.3 reuses the deleted space as well:

[Screenshot: the same test case run on Oracle 7.3.3]

It's the same test case, except that here I'm using a 2k block size.

Index rebuild

Do you think index rebuild can help, or at least is not harmful?

SQL> alter index DEMOPK rebuild;
Index altered.

SQL> @ index_fragmentation.sql

        ID ->         ID rows/block bytes/block %free space     blocks free
---------- -- ---------- ---------- ----------- ----------- ---------- -----
     10001 ->      10001        478        7644           5          1
     10479 ->      10479        479        7659           5          1
     10958 ->      10958        479        7659           5          1
     11437 ->      11437        479        7659           5          1
     11916 ->      11916        478        7644           5          1
     12394 ->      12394        479        7659           5          1
     12873 ->      12873        479        7659           5          1
     13352 ->      13352        479        7659           5          1
     13831 ->      13831        479        7659           5          1
     14310 ->      14310        478        7644           5          1
     14788 ->      14788        479        7659           5          1
     15267 ->      15267        479        7659           5          1
     15746 ->      15746        479        7659           5          1
     16225 ->      16225        479        7659           5          1
     16704 ->      16704        478        7644           5          1
     17182 ->      17182        479        7659           5          1
     17661 ->      17661        479        7659           5          1
     18140 ->      18140        479        7659           5          1
     18619 ->      18619        478        7644           5          1
     19097 ->      19097        479        7659           5          1
     19576 ->      19576        425        6794          15          1

The index rebuild has increased the size of the index: one more leaf block here, because it has left 5% of free space in each block. And that free space will never be reused, because no future rows will ever go there.

 

Conclusion

Is the free space reused in an index on a sequence - always increasing - when we are purging old data?
The answer is: yes... unless you are doing regular index rebuilds.

EMC World 2015 - Last day at Momentum

Yann Neuhaus - Thu, 2015-05-07 22:52

So Momentum is over; Philippe Schweitzer and I finished with a 4-hour hackathon session. For Philippe the subject was "Developing an EMC Documentum application with REST, AngularJS, Bootstrap, Node.js and Socket.io" and I chose "From the Ground Up - Developing an EMC InfoArchive Solution".

But the main purpose of this post is to thank all the people we met during these four enriching days: the EMC people who held the sessions and made demos on the booths, and Catherine Weiss - Partner Sales Manager - who introduced us to great people. Also thank you to the people from fme, Reveille Software and Flatiron with whom we had good discussions.

The atmosphere was amazing, not only during the working days but also at the evening events organized by EMC :-)

So, to be short, Momentum 2015 was a great and successful journey.

Philippe and Gerard

the fastest way to load 1m rows in postgresql

Yann Neuhaus - Thu, 2015-05-07 13:00

There have been several posts on how to load 1m rows into a database in the last few days:

Variations on 1M rows insert (1): bulk insert
Variations on 1M rows insert(2): commit write
Variations on 1M rows insert (1): bulk insert - PostgreSQL
Variations on 1M rows insert(2): commit write - PostgreSQL
Variations on 1M rows insert (3): TimesTen

In this post I'll focus on how to prepare a PostgreSQL database for bulk loading in more detail.

Concrete5 CMS

Yann Neuhaus - Thu, 2015-05-07 03:30

Today a lot of CMSs exist: WordPress, Joomla, Magento and others. In this blog I will share my experience with Concrete5, gained through a specialized web agency based in Geneva: 8 Ways Media


What is Concrete5?



It is not only an open-source CMS (Content Management System) running on a web server, coded in PHP and using a MySQL database, but also a great framework for developers. A simplified (and optional) URL rewriting system is present to improve the indexing of sites by search engines.

C5 can also be used in the development of Web applications.


C5 also makes it possible, through its content management and user rights features, to create intranets for companies (on a small scale in my opinion; for a larger intranet it is better to stay on SharePoint or Alfresco).

This CMS is designed to make life easier for the end user, the handling is simple and intuitive.

It offers advanced management of dynamic websites with a modern design. The editing mode works directly via the front end, and the possibilities are numerous: drag and drop, templates, etc...

INSTALLATION


Verifying Prerequisites


The following components are required for Concrete5 to run correctly: http://www.concrete5.org/documentation/developers/5.7/installation/system-requirements/


For the installation tutorial, please refer to the main site: http://www.concrete5.org/documentation/developers/5.7/installation/installation/


FEATURES


The Dashboard manages the properties related to:
 


  • Rights management


 

  • Management of imported content



  • Templates


  • Block, pages


  • Features: video, forms, presentations, blog, guestbook, etc ...


The GUI is configurable; C5 gives access to very fine-grained customization.
The editing of pages, texts and other content is done via the front end as soon as you are logged in as an administrator or with write permissions on the site. The editing offers two modes: HTML or "Composer".
      
Versioning is an asset: in case of error, a trace of the old version before the changes can easily be restored.
Updates are performed through a simple upload, followed by a single click.

CONCLUSION


Despite the trio "Wordpress - Joomla - Drupal" according to studies, having discovered Concrete5, I recommend it for its very intuitive look and ease of use, on the other hand the developer of communities seem to be active and growing, what facilitates the resolution of "small issues" in cases. Also, exports of all site content in HTML is possible, this could help if you have to change the web server. However, the majority of bases useful plugins have a high cost, the most "design" themes are not free (expense of taste of each) and the support service is paid if the host is not through their bias.
I highly recommend this CMS! Its simplicity and its ability to adapt to different needs allow the creation of a site with confidence and ease.

EMC World 2015 - Day 3 at Momentum

Yann Neuhaus - Wed, 2015-05-06 18:53

In this post I would like to relay some advice around the upgrade to Documentum 7.2 that I got in the session from Patrick Walsh about "What's New, What's Next: EMC Documentum Platform".

With Documentum 7.2 the upgrade process is cleaner and there are fewer restrictions. For instance, more upgrade scenarios are documented, which can help us define the best upgrade path to use.

There was also a slide which listed the following points to take into account when we have to define the right time to upgrade the current installation.

- Features
- Costs
- Dates

Isn't there something missing?
From my point of view, there is at least one additional point to consider. When I do the impact assessment to decide whether and when to upgrade, I study the list of issues that have been fixed in the new version - whether we are impacted by them or not - but also which open issues (who knows an application without bugs?) come with it and whether they are acceptable.

Another helpful piece of information - which can give customers an insight - is the time needed to achieve a typical upgrade project.
Documentum considers 6 to 12 months for the planning (evaluation of the release, gathering business requirements, budget approval and so on) and 6 to 24 months for the implementation and testing.
Based on that, customers still using version 6.7 (End of Support is April 30, 2018) should think about upgrading to version 7.x.

To facilitate the upgrade, the clients and the platform do not have to be upgraded in one big bunch: the D6.x clients can be upgraded first and then the 6.7 platform.

Documentum also introduced "Phased Service Upgrades". For instance, we have the possibility to upgrade xPlore from version 1.3 to 1.5 in phase one and, a couple of months later in phase two, the platform from 6.7 to 7.2.
Or vice-versa: we start with the platform and later on upgrade xPlore.
With this approach, having de-coupled services, we get more flexibility and less downtime.

And now, last but not least, the aim for the future is to have no downtime at all during the upgrade. THIS would be wonderful!

 

EMC World Las Vegas – Momentum 2015 third day D2 news

Yann Neuhaus - Wed, 2015-05-06 17:36

This was more a day of networking with EMC partner contacts and third-party software editors. On the other side, I attended a session about D2 news and what is coming next.

 

EMC divided the D2 enhancements into 3 major themes.

 

The first was about productivity and a modern look and feel, with:

 

    • graphical workflow widget

    • drag and drop from D2 to Desktop

    • better browsing with enhanced facet navigation

    • multi-document support in workflows

    • faster content transfer with new D2-BOCS for distributed environments

 

The second was about information integrity, with:

    • more SSO implementation support, like Tivoli

    • folder import with inner documents as virtual document

 

Then, finally, the third was about software agility, with:

 

    • ease of new user on-boarding with default configuration settings

    • PDF export of D2 configuration for multi environment comparison

 

I hope you enjoyed reading this summary of today at EMC world – Momentum 2015. Thanks for your attention.

getting started with postgres plus advanced server (4) - setting up the monitoring server

Yann Neuhaus - Wed, 2015-05-06 04:26

If you followed the first, second and third posts, the current ppas infrastructure consists of a primary database, a hot standby database and a backup and recovery server.

EMC World Las Vegas – Momentum 2015 second day

Yann Neuhaus - Tue, 2015-05-05 17:05

For the second day of conferences, I attended a first session about migration from Webtop to D2 or xCP. Then we attended two sessions about Documentum platform performance tuning.

 

We could see EMC actually putting effort into taking advantage of open source software. They started to package the whole platform, component by component, into Docker containers. The hope is to simplify upgrades from one version to another.

 

They also invited the audience to migrate to Documentum 7.1/7.2 because a lot of performance enhancements were done, especially for multi-core CPU support and better session pooling management - the latter topic cost us some maintenance time in recent months.

 

A key advantage of migrating from Webtop to D2 or xCP is their capability to cover a lot of business case scenarios out of the box. For instance, when a customer wants to move customizations into D2, they said that by experience we could reach up to 50% feature coverage through software configuration instead of coding. A great saving of time and money, as well as of maintenance costs over time and upon further upgrades.

 

Finally, they also stated that EMC provides a few tools to ease the migration process from the former Life Sciences appliance to the current one, and from Webtop to D2 and xCP.

 

For Documentum platform performance tuning, I invite you to read Gérard's blog following this link.

 

I hope you enjoyed reading this summary of today at EMC World – Momentum 2015. Thanks for your attention.

EMC World 2015 - Day 2 at Momentum

Yann Neuhaus - Tue, 2015-05-05 16:53

Second day at this amazing event. There are not only general and presentation sessions; you can also participate in a so-called "Hands-on Lab". The subject was "EMC Documentum Platform Performance Tuning". So learning by doing is also a good opportunity you can use at Momentum to enhance your skills.


The session covered performance tuning using Fiddler for the HTTP requests, the DFC trace, SQL traces and how to use the related execution plans, and at the end how to tune xPlore, but only for the ingestion phase, meaning the indexing of the documents. The tuning of the fulltext search was not addressed.

Most of the tips I learned were on the xPlore side: which parameters to set to increase performance or to avoid some errors due to timeouts, for instance. I more or less skipped the database tuning. Why? That's not acceptable, you would say. Because we have the experts at dbi services to do this kind of tuning!

So let me give you some information.

DFC trace

In dfc.properties, add:

dfc.tracing.enable=true
dfc.tracing.verbose=true
dfc.tracing.max_stack_depth=0
dfc.tracing.include_rpcs=true
dfc.tracing.mode=compact
dfc.tracing.include_session_id=true
dfc.tracing.dir=c:\dctm\trace


dfc.tracing.enable can be switched from false to true and after a couple of seconds the trace file will be created. Of course, once the value is set back to false, the tracing stops.

xPlore tuning

I would recommend relying on the documentation, but here are some tips:
- disable fulltext for a specific format by setting can_index=false
- filter which document is not indexed by excluding cabinets or folder or even document types
- reduce the online rebuild index max queue size
- reduce the number of indexing threads and CPS request threads
- reduce the number of documents processed simultaneously by the CPS internal processing queue
- increase the number of CPS daemon processes

WARNING: each parameter can have some drawbacks, so take care when you change these values.

So this was in a lab; let's apply it on real cases at customer sites!

SQL Server vNext becomes SQL Server 2016

Yann Neuhaus - Tue, 2015-05-05 15:33

 

Satya Nadella, CEO of Microsoft, announced at the Microsoft Ignite keynote in Chicago that the SQL Server 2016 public preview will come this summer, and you can be sure that at dbi services we're looking forward to taking a look under the hood and testing this next major release.

Amid all the good news, here are the items that caught my attention:

 

[Screenshot: Always Encrypted]

 

First of all, because we're working in Switzerland, security is a main concern at some customer places, and a security feature like Always Encrypted arouses all of our interest. Indeed, a big improvement here concerns the protection of sensitive data both at rest and in motion. In addition, the encryption key is no longer stored in the SQL Server instance and now resides with the application, in a secure location in the customer environment. Microsoft will also provide support for row-level security and dynamic data masking. The latter will probably catch the eye of some of our customers. I think we will hear about these new features in the next few months...

 

[Screenshot: Stretch tables]

 

Another interesting feature is certainly stretch tables, but in fact this is not really news because it was first announced during the previous PASS Summit. This is clearly a hyper-scale cloud feature, mainly focused on hybrid scenarios. You can think of this feature as a kind of table partitioning where historical data (cold data) is dynamically stretched to the cloud.

 

[Screenshot: real-time operational analytics and In-Memory OLTP]

 

When we talk about SQL Server 2014 we automatically think about Hekaton. Will it be the same with SQL Server 2016 and the new real-time operational analytics & In-Memory OLTP feature? At the moment it is too early to speculate, but let's say that this feature will interest a lot of folks, including the dbi Microsoft team :-)   The idea seems pretty simple: combine the existing In-Memory OLTP tables feature and the in-memory columnstore capabilities to obtain a satisfying mix that will increase the scope of possible scenarios (OLTP and analytics). This feature needs to be tested, for sure!

 

 

[Screenshot: availability group enhancements]

 

Another piece of good news concerns the high-availability topic, and especially availability groups, which will now support MSDTC and SSIS. AGs will be able to use native load balancing for reporting workloads against the secondaries. In addition, it will be possible to use 3 synchronous replicas and to configure a failover policy based on database health.

 


Finally, you will note the integration of PolyBase into SQL Server (previously PolyBase was available only with APS) to extend access to both unstructured and structured data by using the T-SQL language. Thus, we come full circle: transaction processing, data warehousing, BI activities and now big data capabilities handled by the same engine.

You can download the SQL Server 2016 datasheet here.

Probably other good news will come until the final release of SQL Server 2016 and we will have the opportunity to blog on them. So stay connected!

 


When tempdb database may cause indirect issues

Yann Neuhaus - Tue, 2015-05-05 13:23

A couple of weeks ago, I had an interesting discussion with a friend of mine who faced a weird issue with a SQL Server instance that lacked worker threads. Because I can't use his own information, I decided to reproduce the same issue in order to share some interesting findings with you. So the next part of this blog post refers exclusively to my own test, but it represents exactly the real issue encountered by my friend.

Let’s set the scene: in the SQL Server error log we found the following records related to our issue:

 

2015-04-16 14:06:16.54 Server     New queries assigned to process on Node 0 have not been picked up by a worker thread in the last 300 seconds. Blocking or long-running queries can contribute to this condition, and may degrade client response time. Use the "max worker threads" configuration option to increase number of allowable threads, or optimize current running queries. SQL Process Utilization: 0%. System Idle: 94%.

2015-04-16 14:11:16.74 Server     New queries assigned to process on Node 0 have not been picked up by a worker thread in the last 600 seconds. Blocking or long-running queries can contribute to this condition, and may degrade client response time. Use the "max worker threads" configuration option to increase number of allowable threads, or optimize current running queries. SQL Process Utilization: 0%. System Idle: 92%.

 

At the same time, we found an event related to the generation of a SQL Server dump file triggered by the deadlock schedulers monitor, as follows:

 

2015-04-16 14:06:04.66 Server     **Dump thread - spid = 0, EC = 0x0000000000000000
2015-04-16 14:06:04.83 Server     ***Stack Dump being sent to C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Log
2015-04-16 14:06:04.83 Server     * *******************************************************************************
2015-04-16 14:06:04.83 Server     *
2015-04-16 14:06:04.83 Server     * BEGIN STACK DUMP:
2015-04-16 14:06:04.83 Server     *   16/04/15 14:06:04 spid 6392
2015-04-16 14:06:04.83 Server     *
2015-04-16 14:06:04.83 Server     * Deadlocked Schedulers
2015-04-16 14:06:04.83 Server     *
2015-04-16 14:06:04.83 Server     * *******************************************************************************
2015-03-27 01:05:02.83 Server     * -------------------------------------------------------------------------------
2015-03-27 01:05:02.83 Server     * Short Stack Dump

 

I began my investigation by having a first look at the SQL Server error log file, but I didn't find any relevant information. So, I decided to move on to the analysis of the dump file.

My first idea was to look at the runaway threads at the moment of the dump file generation (!runaway command). In fact, I first suspected a high-CPU workload because I didn't have any other information from my friend. Here is what I found:

 

[Screenshot: !runaway output]

 


 

[Screenshot: number of running threads]

 

557 threads were in use at the moment of the generation of the dump file. According to the Microsoft documentation here and the concerned SQL Server infrastructure (4 logical processors and max worker threads option = 0), the number of running threads is greater than the configured maximum number of SQL Server worker threads, which is 512 in this case.

 

SQL Server detected 2 sockets with 2 cores per socket and 2 logical processors per socket, 4 total logical processors;

 

We found a lot of thread patterns that seem blocked by a database lock.

 

[Screenshot: call stacks]

 

Note in this case that SQL Server has decided to switch out the "tied" thread, but it is still waiting to be signaled that the database lock is available before it can continue...

Another piece of interesting call stack is the following:

 

[Screenshot: call stacks (2)]

 

 

The above call stack seems to concern the SQL Server start-up process. Let's take a look at functions like sqlmin!DBMgr::OpenUnstartedDatabase, sqlmin!DBMgr::StartupDB or sqlmin!IMEDUtil::SynchronzeWithResourceDB. We can also notice the use of the function sqlmin!StartUp::WaitForTempdbReady, which seems to be related to tempdb readiness. After that, the concerned thread is also waiting for the database lock to become available, and we don't see any function stating that tempdb is ready at the moment of the dump file generation.

 

Well, to summarize, we are facing a general locking issue inside the SQL Server process that has consumed all of the available threads, and we may think that tempdb readiness has something to do with our blocking issue.

So now, let’s go back to the SQL Server error log and let’s have a deeper look at the SQL Server error log event. At this point, I remembered the famous Bob Ward’s session at Summit 2011 about tempdb database in which he explained the importance of tempdb database during the SQL Server start-up process because even if SQL Server is ready for user connections. Indeed, they have to wait until tempdb has really started and it makes sense because we can wonder what’s going on if tempdb database cannot start correctly. Do we allow user connections to perform their work? As a reminder, the unavailability of tempdb is considered as a SQL Server shutdown event.

In my context, we indeed found that the tempdb database was ready only 40 minutes after SQL Server was ready for user connections, as follows:

 

2015-04-16 13:52:38.43 spid5s     Starting up database 'master'.
2015-04-16 13:53:05.76 spid9s     Starting up database 'model'.
2015-04-16 13:53:25.02 spid9s     Clearing tempdb database.
2015-04-16 13:51:48.13 spid13s    SQL Server is now ready for client connections. This is an informational message; no user action is required.
2015-04-16 14:32:12.28 spid9s     Starting up database 'tempdb'.
2015-04-16 14:32:15.71 spid9s     Recovery completed for database tempdb (database ID 2) in 2 second(s)

 

The $100 question: why did tempdb start so slowly? Well, I didn't have all the information, but my friend told me that they had a storage maintenance at the time... maybe we can correlate these two events, but without any further investigation it is not possible to say for sure.

Anyway, it was very interesting to see how the tempdb database may trigger an escalation of issues in this case. There are probably a lot of ways to reproduce this issue.

Finally, let me finish this blog post by explaining my test scenario. To simulate the same issue as my friend, I first created a very big user table inside the model database to considerably slow down the start-up of my tempdb database. Then I configured tempdb with big database files without instant file initialization (zeroing is then mandatory for all database files). At the same time, I used the ostress tool to simulate roughly 600 running threads. Finally, I let the magic happen...
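
For illustration, here is a minimal sketch of the tempdb part of that setup (the logical file name and the size are assumptions; the point is simply that, without instant file initialization, such a data file has to be zeroed again at every SQL Server start-up because tempdb is recreated each time):

ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, SIZE = 50GB);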

I hope this blog will help you in your database administrator daily work.

Understand the Lifecycle of Alfresco Nodes

Yann Neuhaus - Tue, 2015-05-05 12:19


I guess you all already know what a lifecycle is. We are born, we live and we die... In fact, it's exactly the same for Alfresco Nodes! Well, at least from an end user point of view. But what is really going on behind that? This is what I will try to explain in this post.

First of all, what is an Alfresco Node? For most people, a Node is just a document stored in Alfresco, but in reality it's much more than that: everything in Alfresco is a Node! A Node has a type that defines its properties and it also has some associations with other Nodes. This is a very short and simplified description, but we don't need to know exactly what a Node is to understand the lifecycle process. So, as said above, Alfresco Nodes have their own lifecycle, but it's a little bit more complicated than just three simple steps.

Please note that in this post, I will use $ALF_HOME as a reference to the location where Alfresco has been installed (e.g. /opt/alfresco-4.2.c) and $ALF_DATA as a reference to the alf_data location. By default, the alf_data folder is $ALF_HOME/alf_data/.


I. Creation


In this post I will use a document as the example of an Alfresco Node to make the process easier to understand. So this lifecycle starts with the creation of a new document named "Test_Lifecycle.docx" with the following creation date: February the 1st, 2015 at 16:45.

When a document is created in Alfresco, three things are done:

  • File System: the content of this file is stored in the Alfresco "Content Store". The Content Store is by default under $ALF_DATA/contentstore. The file is actually put somewhere under this folder, in a location that depends on the creation time, and an ID is given to this file. For our file, it would be: $ALF_DATA/contentstore/2015/2/1/16/45/408a6980-237e-4315-88cd-6955053787c3.bin.
  • Database: the metadata of this file is stored in the Alfresco database. In the DB, this document is mainly referenced using its NodeRef or NodeID. This NodeRef is something we can see in the Alfresco Web Interface from the document details page (web preview of a document): http://HOSTNAME:PORT/share/page/document-details?nodeRef=workspace://SpacesStore/09a8bd9f-0246-47a8-9701-29436c7d29a6. Please be aware that the NodeRef contains a UUID, but it's not the same as the ID on the Content Store side... Moreover, there is a property in the DB that links the NodeRef to the Content Store's ID so that Alfresco is able to retrieve the content of a file (see the query sketch just after this list).
  • Index: an index is created for this file in the Search engine (can be Lucene for older versions of Alfresco or Solr for newer versions). This index is in the "workspace" store.
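
To make the NodeRef-to-content link of the second point more concrete, here is a minimal query sketch on the standard tables (the filter on local_name='content' selects the cm:content property; the UUID is the example one used above):

SELECT n.uuid,
       u.content_url
FROM alf_node AS n,
     alf_node_properties AS p,
     alf_content_data AS d,
     alf_content_url AS u
WHERE p.node_id=n.id
  AND p.qname_id IN (SELECT id FROM alf_qname WHERE local_name='content')
  AND p.long_value=d.id
  AND d.content_url_id=u.id
  AND n.uuid='09a8bd9f-0246-47a8-9701-29436c7d29a6';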



II. Update, Review, Approve, Publish, aso...


Once the document is created, its life really begins. You can update/review/approve/publish it manually or automatically with different processes. All these actions are part of the active life of a document. From an administration point of view, there isn't that much to say here.


III. Deletion - User level

 
When a document isn't needed anymore, for any reason, a user with sufficient permissions is able to delete it. For our example, let's say that a user deleted our file "Test_Lifecycle.docx" using the Alfresco Share Web Interface on February the 20th, 2015 at 15:30 (19 days after creation). When using the Web Interface or Web Services to delete a document, the "nodeService.deleteNode" method is called. So what happened to our "three things"?

  • FS: nothing changed on the Content Store. The file content is still there.
  • DB: on the DB side, the NodeRef changed from workspace://SpacesStore/09a8bd9f-0246-47a8-9701-29436c7d29a6 to archive://SpacesStore/09a8bd9f-0246-47a8-9701-29436c7d29a6 (the "store_id" field changed in the "alf_node" table, see the query just after this list).
  • Index: same thing for the search index: the index is moved from the "workspace" store to the "archive" store.
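
This move can be verified directly in the database with a small query sketch like the following (assuming the standard alf_node and alf_store tables; the protocol and identifier columns of alf_store give the store part of the NodeRef):

SELECT n.id, n.uuid, s.protocol, s.identifier
FROM alf_node AS n,
     alf_store AS s
WHERE n.store_id=s.id
  AND n.uuid='09a8bd9f-0246-47a8-9701-29436c7d29a6';

Before the deletion, the protocol column shows "workspace"; afterwards it shows "archive".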


Actually, when a user deletes a document from a Web Interface of Alfresco, the document is just moved to a "global trashcan". By default, all users have access to this global trashcan in Alfresco Explorer to restore the documents they may have deleted by mistake. Of course they can't see all documents, only the ones related to them. In Alfresco Share, access to this global trashcan is configured in the share-config.xml file and by default, on most versions, only administrators have access to it.

The only way to bypass this global trashcan is to programmatically delete the document by applying the aspect "cm:temporary" to it and then calling "nodeService.deleteNode" on it. In that way, the document is removed from the UI and isn't put in the global trashcan.


 IV. Deletion - Administration level

 
What I describe here as the "Administration level" is the second level of deletion that happens by default. This level is the deletion of the document from the global trashcan. As long as the document is still in the global trashcan, Administrators (or users if you are using Alfresco Explorer) can still restore it. If the document is "un-deleted", then it will return exactly where it was before and of course the metadata & index of this document will be moved from the "archive" store back to the "workspace" store to return to an active life.

On April the 1st, 2015 at 08:05 (40 days after the deletion at user level), an administrator decides to remove "Test_Lifecycle.docx" from the global trashcan. This can be done manually or programmatically. Moreover, there are also some existing add-ons that can be configured to automatically delete elements in the trashcan older than XX days. This time, the "NodeArchiveService.purgeArchiveNode" method is called (archiveService.purge in some older Alfresco versions). So what happened to our "three things" this time?

  • FS: still nothing changed on the Content Store. The file content is still there.
  • DB: on the DB side, the document is still there but some references/fields (not all) are removed. All references in the "alf_content_data" table are removed, while only some fields are emptied in the "alf_node" table. For Alfresco 4.0 and below, the "node_deleted" field in the "alf_node" table is changed from 0 to 1. In newer versions of Alfresco, the "node_deleted" field doesn't exist anymore, but the QNAME of the node (field "type_qname_id") in the "alf_node" table is changed from 51 ("content") to 140 ("deleted"). So the Node is now deleted from the global trashcan and Alfresco knows that this node can be safely removed, but this will not be done right away... Once the "node_deleted" or "type_qname_id" is set, the "orphan_time" field in the "alf_content_url" table for this document is also changed from NULL to the current unix timestamp in milliseconds (+ GMT offset). In our case it will be orphan_time=1427875500540 (see the query just after this list).
  • Index: the search index for this Node is removed.
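
To list the content that is currently flagged as orphaned (and since when), a query sketch like this one can be used on PostgreSQL - assuming, as shown above, that orphan_time is stored in milliseconds since epoch:

SELECT u.id,
       u.content_url,
       round(u.content_size/1024/1024.0,2) AS "Size (MB)",
       to_timestamp(u.orphan_time/1000)    AS "Orphan since"
FROM alf_content_url AS u
WHERE u.orphan_time IS NOT NULL
ORDER BY u.orphan_time;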


As you can see, there are still some remaining elements on the File System and in the DB. That's why there is another step in our lifecycle...


V. Deletion - One more step.


As you saw before, the document "Test_Lifecycle.docx" is now considered an "orphaned" Node. In Alfresco, by default, all orphaned Nodes are protected for 14 days. That means that during this period, the orphaned Nodes will NOT be touched at all. Of course this value can easily be changed in the Alfresco configuration files. So what happens after 14 days? Well, in fact, every day at 4am (again, by default... it can be changed), a scheduled job (the "contentStoreCleaner") scans Alfresco for orphaned Nodes older than 14 days. Therefore on April the 15th, 2015 at 04:00, the scheduled job runs and here is what it does:

  • FS: the content file is moved from $ALF_DATA/contentstore/ to $ALF_DATA/contentstore.deleted/
  • DB: on the DB side, the document is still there but the line related to this document in the "alf_content_url" table (this table contains the orphan_time and the reference to the FS location) is removed.
  • Index: nothing to do, the search index was already removed.


You can avoid this step where documents are put in the "contentstore.deleted" folder by setting "system.content.eagerOrphanCleanup=true" in the alfresco-global.properties configuration file. If you do so, after 14 days the document on the File System is not moved but deleted instead.
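
For reference, here is a sketch of what these settings could look like in alfresco-global.properties - please double-check the exact name of the protection-period property ("system.content.orphanProtectDays" to the best of my knowledge) against the documentation of your Alfresco version:

# Number of days during which orphaned content is protected (default: 14)
system.content.orphanProtectDays=14
# Delete orphaned content directly instead of moving it to contentstore.deleted
system.content.eagerOrphanCleanup=true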


VI. Deletion - Still one more step...!


As said before, there are still some references to the "Test_Lifecycle.docx" document in the Alfresco Database (especially in the "alf_node" table). Another scheduled job, the nodeServiceCleanup, runs every day at 21:00 to clean everything related to Nodes that have been deleted (orphaned nodes) for more than 30 days. So here is the result:

  • FS: the content file is still in the $ALF_DATA/contentstore.deleted/ folder
  • DB: the DB is finally clean!
  • Index: nothing to do, the search index was already removed.
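
To get an idea of how many purged nodes are still waiting for this cleanup, a small query sketch like the following can be run (assuming, as described above, that purged nodes are flagged with the "deleted" type in alf_qname):

SELECT count(*) AS "Deleted nodes waiting for cleanup"
FROM alf_node AS n,
     alf_qname AS q
WHERE n.type_qname_id=q.id
  AND q.local_name='deleted';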



VII. Deletion - Oh you must be kidding me!?


So many steps, aren't there? As seen before, the only remaining thing to do is to remove the content file from the $ALF_DATA/contentstore.deleted/ folder. You probably think that there is also a job that does that for you after XX days, but it's not the case: there is nothing in Alfresco that deletes the content files from this location. Consequently, if you want to clean the File System, you will have to do it yourself.

On Unix for example, you can simply create a crontab entry:

50 23 * * * $ALF_HOME/scripts/cleanContentStoreDeleted.sh


Then create this file with the following content:

#!/bin/sh

# Location of the alf_data folder (adjust this path to your installation)
ALF_DATA=/opt/alfresco-4.2.c/alf_data
CS_DELETED=$ALF_DATA/contentstore.deleted

# Remove all files from contentstore.deleted older than 30 days
find $CS_DELETED -type f -mtime +30 | xargs -r rm 2> /dev/null

# Remove all empty sub-folders from contentstore.deleted older than 60 days
# (-mindepth 1 makes sure the contentstore.deleted folder itself is never removed)
find $CS_DELETED -mindepth 1 -type d -mtime +60 -empty | xargs -r rm -r 2> /dev/null


Please be aware that you should be sure that the folder "$ALF_DATA/contentstore.deleted" exists... And when I say that, I mean YOU MUST ABSOLUTELY BE SURE that it exists. Please also never remove anything under the "contentstore" folder and never remove the "contentstore.deleted" folder itself!

You may wonder why the DB isn't cleaned automatically when the trashcan is cleaned and why the file content is also kept for 14 days... Well, I can assure you that there are several reasons and the principal one is backup/restore performance. I will not explain it in detail but basically, as the file content isn't touched for 14 days by default, it means that you can restore your database up to 14 days in the past and your database will still be consistent with the File System without having to restore the FS! Of course if you just do that, you will lose the documents uploaded/changed in the last 14 days because your database wasn't aware of these files 14 days ago. But you can simply backup/restore the content files created in the last 14 days with an incremental backup.

E.g.: today (05-May-2015), I want to backup the FS (only the last 14 days). I will then have to backup all folders inside $ALF_DATA/contentstore/2015/5 AND all folders inside $ALF_DATA/contentstore/2015/4 with a folder name greater than or equal to 30 (days in April) + 5 (days in May) - 14 = 21.


I hope this post was clear enough, because it's true that it can be hard to understand everything regarding the lifecycle of Alfresco Nodes and to deal with it properly.

In-Memory OLTP: Hash and Range indexes

Yann Neuhaus - Tue, 2015-05-05 02:04

Since SQL Server 2014 CTP2, Microsoft has introduced a new kind of index: the Range index. We now have two ways of indexing an In-Memory table: using either a Hash index or a Range index.

These two indexes are slightly different.
In fact, a Hash index is a set of buckets, 8-byte memory pointers, each pointing to the actual row or row chain. SQL Server 2014 uses a hash function to map index key values to buckets. To minimize hash collisions, the number of buckets should be twice the number of distinct index values. A hash collision occurs when the hash function returns the same bucket for two different data values, for example "Audi" and "Mercedes". Hash indexes are optimized for equality predicates.
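
To check whether the bucket count of a hash index is well sized (and to spot long row chains caused by hash collisions or duplicate values), the sys.dm_db_xtp_hash_index_stats DMV can be queried. A minimal sketch, assuming an In-Memory table with a hash index already exists in the current database:

select object_name(s.object_id) as table_name,
       i.name                   as index_name,
       s.total_bucket_count,
       s.empty_bucket_count,
       s.avg_chain_length,
       s.max_chain_length
from sys.dm_db_xtp_hash_index_stats s
join sys.indexes i on i.object_id = s.object_id
                  and i.index_id  = s.index_id;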

In contrast, for Range indexes, the leaf level of a Bw-Tree gives you a pointer to the first data row in the row chain. As with hash indexes, rows with the same indexed value form a row chain. This kind of index is particularly efficient with inequality predicates like < or >. If you want to know more about Range indexes, please read the excellent blog of my colleague David Barbarin.

To compare these two kinds of indexes, I will create a new database with two tables: one In-Memory table with a hash index on a datetime column and another one (a disk-based table) with a non-clustered index on the same column. I will fill these tables with one million records. Here is the script I used:

 

-- database creation

use master

go


if exists(select * from sys.databases where name = 'Test_InMemoryOLTP_Optimized')

   drop database Test_InMemoryOLTP_Optimized

go


create database Test_InMemoryOLTP_Optimized

go


--create a disk table

use Test_InMemoryOLTP_Optimized

go


if exists(select * from sys.objects where name = 'Person_OnDisk')

   drop table dbo.Person_OnDisk

go


create table dbo.Person_OnDisk

(

Person_OnDisk_ID int not null primary key,

Person_OnDisk_Name nvarchar(200) not null,

Person_OnDisk_Date datetime not null,

index Person_OnDisk_Person_OnDisk_Date nonclustered (Person_OnDisk_Date)

)


--enable database for memory optimized tables

-- add memory_optimized_data filegroup

alter database Test_InMemoryOLTP_Optimized

   add filegroup Test_InMemory_mod contains MEMORY_OPTIMIZED_DATA

go


--add container to the filegroup

alter database Test_InMemoryOLTP_Optimized

   add file (name='InMemory_mod', filename='d:\InMemory\InMemory_mod')

   to filegroup Test_InMemory_mod

go


use Test_InMemoryOLTP_Optimized

go


if exists(select * from sys.objects where name = 'Person_InMemory')

   drop table dbo.Person_InMemory

go


create table dbo.Person_InMemory

(

Person_InMemory_ID int not null primary key

       nonclustered hash with (bucket_count = 2000000),

Person_InMemory_Name nvarchar(200) COLLATE Latin1_General_100_BIN2 NOT NULL,

Person_InMemory_Date datetime not null index Person_InMemory_Person_InMemory_Date

nonclustered hash (Person_InMemory_Date) with (bucket_count = 2000000)

)

with (memory_optimized = on, durability = schema_and_data)


--fill in disk table

set nocount on

go

begin tran

declare @cpt int = 0

While @cpt < 1000000

begin

   insert into dbo.Person_OnDisk values

       (@cpt, 'Name_'+cast(@cpt as varchar(10)), dateadd(dd,round(@cpt/10000,0,1),'2014-01-01'))

       set @cpt += 1

end

commit


--fill in In-Memory table

set nocount on

go

begin tran

declare @cpt int = 0

While @cpt < 1000000

begin

   insert into dbo.Person_InMemory values

       (@cpt, 'Name_'+cast(@cpt as varchar(10)), dateadd(dd,round(@cpt/10000,0,1),'2014-01-01'))

       set @cpt += 1

end

commit

 

First remark: filling the In-Memory table is three times faster than filling the disk-based table, respectively 5 seconds for the former against 15 seconds for the latter on my machine. You can also see that I decided to persist my In-Memory table data, as I used the durability option schema_and_data instead of schema_only; with schema_only, the data would not survive a restart of my SQL Server service and would be definitively lost.
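
For comparison, a schema_only variant would look like the sketch below (hypothetical table name); the table structure and indexes survive a restart, but the rows do not:

-- hypothetical staging table: same structure, but data is lost after a restart
create table dbo.Person_InMemory_Staging
(
Person_InMemory_Staging_ID int not null primary key
       nonclustered hash with (bucket_count = 2000000),
Person_InMemory_Staging_Name nvarchar(200) COLLATE Latin1_General_100_BIN2 NOT NULL,
Person_InMemory_Staging_Date datetime not null
)
with (memory_optimized = on, durability = schema_only)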

 

Now, I will search for records where the date is equal to 2014-01-14 and compare the results between my In-Memory and my disk-based tables. I will use the IO and TIME statistics options to get a better view of each query's performance:

 

-- use equality operator with disk table and In-Memory table with Hash index

USE Test_InMemoryOLTP_Optimized

GO

 

SET NOCOUNT ON

SET STATISTICS IO ON

SET STATISTICS TIME ON

 

SELECT *

FROM dbo.Person_OnDisk

WHERE Person_OnDisk_Date = '2014-01-14'

 

SELECT *

FROM dbo.Person_InMemory

WHERE Person_InMemory_Date = '2014-01-14'

 

The result of those queries is:

 

Table on disk:

Table 'Person_OnDisk'. Scan count 1, logical reads 5812, physical reads 3, read-ahead reads 5805, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

 

SQL Server Execution Times:

   CPU time = 31 ms, elapsed time = 1631 ms.

 

 

Table In-Memory:

SQL Server Execution Times:

   CPU time = 0 ms, elapsed time = 128 ms.

 

The disk-based table made multiple logical reads (5812), three physical reads and 5805 read-aheads, whereas the In-Memory table, as it resides in memory, did not perform any disk I/O activity.
Concerning the execution time, my disk-based table consumed 31 ms of CPU and ran in 1631 ms, against no CPU and 128 ms for my In-Memory table...
The In-Memory table is definitively faster than my disk-based table.
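
As a side check, the memory consumed by the In-Memory table can be confirmed with the sys.dm_db_xtp_table_memory_stats DMV; a minimal sketch:

select object_name(object_id) as table_name,
       memory_used_by_table_kb,
       memory_used_by_indexes_kb
from sys.dm_db_xtp_table_memory_stats
where object_id = object_id('dbo.Person_InMemory');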

 

Let's see now what happens with an inequality predicate. So I am using the same queries with a date range (> and <):

 

SET NOCOUNT ON

SET STATISTICS IO ON

SET STATISTICS TIME ON

 

SELECT *

FROM dbo.Person_OnDisk

WHERE Person_OnDisk_Date > '2014-01-14' and Person_OnDisk_Date < '2014-02-20'

 

 

SELECT *

FROM dbo.Person_InMemory

WHERE Person_InMemory_Date > '2014-01-14' and Person_InMemory_Date < '2014-02-20'

 

 

The result is:

 

Disk table:

Table 'Person_OnDisk'. Scan count 1, logical reads 5812, physical reads 3, read-ahead reads 5805, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

 

SQL Server Execution Times:

   CPU time = 171 ms, elapsed time = 3150 ms.

 

 

In-Memory table:

SQL Server Execution Times:

CPU time = 204 ms, elapsed time = 2153 ms.

 

I/Os for both tables did not change, but concerning the In-Memory table we can see that the gain is not very interesting compared to the equality predicate seen before. The Hash index is not tuned for inequality predicates. I will now change the Hash index of my In-Memory table to a Range index. To do so, I will have to drop and recreate my table, as it is not possible to alter an In-Memory table, and then reload the data.

 

--Change the Hash index to a Range index

if exists(select * from sys.objects where name = 'Person_InMemory')

   drop table dbo.Person_InMemory

go

 

create table dbo.Person_InMemory

(

Person_InMemory_ID int not null primary key

       nonclustered hash with (bucket_count = 2000000),

Person_InMemory_Name nvarchar(200) COLLATE Latin1_General_100_BIN2 NOT NULL,

Person_InMemory_Date datetime not null index Person_InMemory_Person_InMemory_Date

nonclustered (Person_InMemory_Date)

)

with (memory_optimized = on, durability = schema_and_data)

 

 

In-Memory table:

 

SQL Server Execution Times:

   CPU time = 47 ms, elapsed time = 2155 ms.

 

The CPU time is now more than four times lower with a Range index compared to a Hash index.
If I take a look at the execution plan, I can see that the optimizer uses the range index defined in my script with an index seek operation, instead of the scan observed with the hash index for this inequality predicate.

 

Before creating an In-Memory table, remember that you will have to think about which kind of predicates will be used in the queries in order to choose the right index: Hash or Range, or even both. In addition, don't forget that an In-Memory table accepts a maximum of eight indexes. Finally, if you need to create another index on an In-Memory table, you will have to drop and recreate the table.
I hope this blog will help. If you want to know more about SQL Server 2014 In-Memory OLTP, please join us at our next event in June; details and registration here.

Variations on 1M rows insert (3): TimesTen

Yann Neuhaus - Mon, 2015-05-04 23:00

In the previous blog post I measured how inserting 1 million rows takes 5 minutes instead of 45 seconds when we commit at each row inserted. In Oracle we can improve that down to 2 minutes if we can accept (and manage) the loss of transactions following an instance crash.
Let's go further. For OLTP workloads that need to face a high rate of transactions, Oracle has TimesTen, which is a 'Database In-Memory' - meaning that it is optimized for in-memory structures. My test case has a 1 million rows table that can fit in memory, so let's try it.

Test case

I am still using the Oracle Developer Day 11g VM which has TimesTen installed.

I've configured my database in /home/oracle/app/oracle/product/TimesTen/tt1122/info/sys.odbc.ini

[sampledb_1122]
Driver=/home/oracle/app/oracle/product/TimesTen/tt1122/lib/libtten.so
DataStore=/home/oracle/app/oracle/product/TimesTen/tt1122/info/DemoDataStore/sampledb_1122
PermSize=400
TempSize=320
PLSQL=1
DatabaseCharacterSet=US7ASCII
I used the sample one, but I have increased the memory sizes to 400MB and 320MB for PERM_ALLOCATED_SIZE and TEMP_ALLOCATED_SIZE.

ttIsql

So I run the TimesTen command line interpreter:

[oracle@localhost ~]$ ttisql

Copyright (c) 1996-2011, Oracle.  All rights reserved.
Type ? or "help" for help, type "exit" to quit ttIsql.
check the version
Command> version
TimesTen Release 11.2.2.2.0
connect to my database
Command> connect "dsn=sampledb_1122";
Connection successful: DSN=sampledb_1122;UID=oracle;DataStore=/home/oracle/app/oracle/product/TimesTen/tt1122/info/DemoDataStore/sampledb_1122;DatabaseCharacterSet=US7ASCII;ConnectionCharacterSet=US7ASCII;DRIVER=/home/oracle/app/oracle/product/TimesTen/tt1122/lib/libtten.so;PermSize=40;TempSize=32;TypeMode=0;
(Default setting AutoCommit=1)
and I want to manage commits myself
Command> set autocommit 0;
and then I'm doing the same as in previous posts:
Command> set echo on;

Command> create table DEMO ("id" number primary key, "text" varchar2(15), "number" number);

Command> set timing on;
There is something that is very nice about TimesTen: it supports PL/SQL. I'm running exactly the same as I did in Oracle:
Command> declare
 type people_array is varray(12) of varchar(15);
 people people_array := people_array( 'Marc', 'Bill', 'George', 'Eliot', 'Matt', 'Trey', 'Tracy','Greg', 'Steve', 'Kristina', 'Katie', 'Jeff' );
 people_count number :=people.COUNT;
 n number;
begin
 for i in 0..1e6 loop
  n:=trunc(dbms_random.value(0,10000));
  insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
  commit;
 end loop;
 commit;
end;
/

PL/SQL procedure successfully completed.

Execution time (SQLExecute) = 21.010554 seconds.
21 seconds. Yes, that's fast...

Durable commits

Do you remember how we improved the commits in the last post? When accepting the risk of non-durable commits, the response time was faster on the Oracle database because we don't have to wait for the persistence of commits.

Here in TimesTen, it's an In-Memory database and by default transactions are not durable. Transactions are logged for rollback, but not for recovery, which means that nothing goes to disk. If the instance crashes, data is lost. If I want the durability of the ACID properties, then we have to write the transaction log to disk.
This is activated by the following property in the sys.odbc.ini:

DurableCommits=1;

I connect:
Command> connect "dsn=sampledb_1122";
Connection successful: DSN=sampledb_1122;UID=oracle;DataStore=/home/oracle/app/oracle/product/TimesTen/tt1122/info/DemoDataStore/sampledb_1122;DatabaseCharacterSet=US7ASCII;ConnectionCharacterSet=US7ASCII;DurableCommits=1;DRIVER=/home/oracle/app/oracle/product/TimesTen/tt1122/lib/libtten.so;PermSize=40;TempSize=32;TypeMode=0;
(Default setting AutoCommit=1)
and run exactly the same as before:
PL/SQL procedure successfully completed.

Execution time (SQLExecute) = 31.316902 seconds.
It's longer, but still faster than the Oracle database. TimesTen is optimized for that.

Conclusion

Note that we have durability here, thanks to the persistence of the transaction log, as long as we set DurableCommits=1. However, in case of an instance crash, the recovery may be long because a lot of transactions must be redone. It's an In-Memory database: it's fast when it's up, but you don't have the same availability as a database that is optimized to store data on disk. You have to tune the checkpoints in order to balance between performance and availability.

The TimesTen In-Memory database can be a good solution to accelerate some use cases. Its compatibility with Oracle SQL and PL/SQL is very good. You can install it standalone or as a cache for Oracle Database. Licensing is 50% of Oracle Enterprise Edition.

TimesTen is also called 'Database In-Memory', which is a bit misleading because Oracle also has the 12c In-Memory option, which is very different. Among competitors, TimesTen can be compared to SQL Server In-Memory OLTP. Because it's a unique event - a cross-technology one (MSSQL, ORA, HANA) - I don't hesitate to link again to our free event In-Memory: boost your IT performance! in Switzerland (Lausanne, Basel and Zürich).

Talking about cross-technology expertise, don't miss Daniel's variation on 1M rows insert into PostgreSQL and with synchronous commit.

EMC World Las Vegas – Momentum 2015 first day

Yann Neuhaus - Mon, 2015-05-04 18:34

Before starting to talk about the EMC event, I would just like to share my feelings about the city where it takes place: Las Vegas - what an amazing, impressive place! On top of that, we are staying in one of the most beautiful hotels on the Strip! Congratulations to EMC and thanks for this attention.

 

For this first day of conferences, I decided to attend a first session about the xCP platform. I usually manage content with D2 and wanted to get some background on its process-oriented sibling, xCP. Then I will go a bit further into the news and forecasts about the underlying content management platform, Documentum itself.

 

xCP is one of the leading applications EMC wants to promote, together with D2, within its enterprise content management offering. In summary, xCP is described as a "rapid application development platform". It means it helps developers and business analysts build applications while reducing real development work as much as possible, by providing an "extensive platform for building case management and business process solutions".

 

In fact, it aims to graphically build up applications by putting together several items, like forms and operational bricks, organized through processes, in order to provide the functionalities business teams are looking for. Such an approach also aims to reduce development costs and improve maintainability over time for applications that are more complex than simple faceted records management.

 

Meanwhile, EMC released the Documentum 7.2 platform with its several clients: D2 4.5, xCP 2.2 and Webtop 6.8.

 

In this release we can see several improvement areas, for example security. With Documentum 7.2, we can now store and transmit content using AES 256-bit encryption, considered by the US NSA to be good enough for Top Secret protection for the moment.

 

This version also provides enhanced capabilities through REST web services for searches, facet navigation and batch transactions, and indeed further integration extensibility for third-party software.

 

xPlore 1.5 also provides its own growing set of functionalities, like cross-repository subscriptions, privacy options, enhanced language word splitting and improved warm-up. It is also good to know that it keeps backward compatibility down to Documentum 6.7 SP2.

 

Then, for the next upgrades, EMC also provides higher-level migration tools, like one for the automatic encryption upgrade in Documentum.

 

This day was also very rich in terms of EMC corporate communication about overall product policies and strategies for the coming years and releases.

 

I hope you enjoyed reading this short summary of the first day at EMC World - Momentum 2015, and I would like to thank you for your attention.

Variations on 1M rows insert(2): commit write - PostgreSQL

Yann Neuhaus - Mon, 2015-05-04 16:04
Franck was quite fast with his second post Variations on 1M rows insert(2): commit write. Let's see what we can do on the PostgreSQL side.

Variations on 1M rows insert (1): bulk insert - PostgreSQL

Yann Neuhaus - Mon, 2015-05-04 11:30

When I read Franck's post about Variations on 1M rows insert (1): bulk insert, I thought doing quite the same in PostgreSQL might be interesting. Let's start by using the same test tables, one using a primary key and the other one without a primary key:

create table DEMO ("id" int , "text" varchar(15), "number" int);

create table DEMO_PK ("id" int , "text" varchar(15), "number" int,
                  constraint demo_pk_pk primary key (id) ) ;

postgres=# \d demo*
            Table "public.demo"
 Column |         Type          | Modifiers 
--------+-----------------------+-----------
 id     | integer               | 
 text   | character varying(15) | 
 number | integer               | 

           Table "public.demo_pk"
 Column |         Type          | Modifiers 
--------+-----------------------+-----------
 id     | integer               | not null
 text   | character varying(15) | 
 number | integer               | 
Indexes:
    "demo_pk_pk" PRIMARY KEY, btree (id)

   Index "public.demo_pk_pk"
 Column |  Type   | Definition 
--------+---------+------------
 id     | integer | id
primary key, btree, for table "public.demo_pk"

I am using a VirtualBox VM with Oracle Linux 7.1 (64bit), 512MB of memory and one vCPU for the tests:

cat /etc/oracle-release 
Oracle Linux Server release 7.1
cat /proc/meminfo | head -1
MemTotal:         502612 kB
cat /proc/cpuinfo | grep proc
processor	: 0



Similar to Franck's PL/SQL block, I'll use a plpgsql block for the first tests:

\timing on
show autocommit;
truncate table DEMO;
DO $$DECLARE
  l_people_array varchar(15)[12] := '{"Marc", "Bill", "George", "Eliot", "Matt", "Trey", "Tracy","Greg", "Steve", "Kristina", "Katie", "Jeff"}';
  n int;
BEGIN
  for i in 0..1e6 loop
     n:=trunc(random()*1000+1);
     insert into DEMO values( i , l_people_array[floor((random()*11))+1::int] , n );
  end loop;
END$$;

(The code for the demo_pk table is exactly the same, except for the table name.) It does not do exactly the same as Franck's PL/SQL block, but it does comparable things, I believe :)

The PostgreSQL version I'll use is 9.4.1:

postgres=# select version();
                                                         version                                                          
--------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit
(1 row)


Let's start by executing the test without a primary key and thus no index:

postgres=# \i 1mio_rows_no_pk.sql
Timing is on.
 autocommit 
------------
 on
(1 row)

Time: 0.183 ms
TRUNCATE TABLE
Time: 71.339 ms
DO
Time: 6861.547 ms

Not so bad. Doing the same with a primary key, and thus an index to maintain:

postgres-# \i 1mio_rows_pk.sql
Timing is on.
 autocommit 
------------
 on
(1 row)

Time: 0.494 ms
TRUNCATE TABLE
Time: 37.314 ms
DO
Time: 10900.631 ms

Around 3 seconds more, but again: not so bad. As there is no "bulk collect" or "bulk insert" in PostgreSQL, I cannot do the same tests as Franck. But PostgreSQL has a COPY command, so let's look at that. First I'll unload the table to a flat file so that I have something to load:

postgres=# copy demo to '/home/postgres/demo.txt';
COPY 1000001
Time: 239.161 ms

Believe it or not: it took 240 ms to unload 1000001 rows from a table to a flat file. Now let's do the opposite and load the data back into the table using the COPY command. The script used for this looks like this for both variations (with and without primary key):

\timing on
show autocommit;
truncate table demo;
copy DEMO from '/home/postgres/demo.txt';

First, the one without the primary key:

postgres=# \i 1mio_rows_no_pk_copy.sql
Timing is on.
 autocommit 
------------
 on
(1 row)

Time: 0.663 ms
TRUNCATE TABLE
Time: 32.026 ms
COPY 1000001
Time: 1877.480 ms

The second one with the primary key:

postgres=# \i 1mio_rows_pk_copy.sql
Timing is on.
TRUNCATE TABLE
Time: 33.370 ms
COPY 1000001
Time: 6227.844 ms

This makes a big difference: 1.8 seconds compared to 6.2 seconds. But still amazingly fast. For those who know PostgreSQL a little bit and might wonder about fsync:

postgres=# show fsync;
 fsync 
-------
 on
(1 row)

If I turn this off and run both scripts again (first without, then with the primary key):

postgres=# alter system set fsync='off';
ALTER SYSTEM
Time: 62.066 ms
postgres=# select pg_reload_conf();
LOG:  received SIGHUP, reloading configuration files
LOG:  parameter "fsync" changed to "off"
 pg_reload_conf 
----------------
 t
(1 row)

Time: 7.253 ms
postgres=# show fsync;
 fsync 
-------
 off
(1 row)

Time: 0.601 ms
postgres=# \i 1mio_rows_no_pk_copy.sql
Timing is on.
 autocommit 
------------
 on
(1 row)

Time: 0.277 ms
TRUNCATE TABLE
Time: 10.064 ms
COPY 1000001
Time: 611.455 ms
postgres=# \i 1mio_rows_pk_copy.sql
Timing is on.
TRUNCATE TABLE
Time: 11.768 ms
COPY 1000001
Time: 4674.273 ms

Around 0.6 seconds without the primary key and around 4.7 seconds with the primary key. But remember that you might lose data if fsync is off and the server crashes during the load.
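
If a few lost transactions after a crash are acceptable but database corruption is not, a less drastic alternative (a sketch, not what was measured above) would be to keep fsync on and relax synchronous_commit instead, either per session or instance-wide:

-- per session: only this session's most recent commits may be lost after a crash
set synchronous_commit = off;

-- or instance-wide
alter system set synchronous_commit = 'off';
select pg_reload_conf();

Unlike fsync=off, this setting cannot corrupt the database; at worst the last few commits are lost after a crash.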

Let's wait and see what Franck will do in the next post of his series.

Variations on 1M rows insert(2): commit write

Yann Neuhaus - Mon, 2015-05-04 10:39

In the previous blog post I measured the time it takes to insert 1 million rows: about 45 seconds when done row-by-row, which goes down to 7 seconds when inserted in bulk. In both cases, the commit was done only at the end. But in some cases you need to commit at each row, in order to make it visible to other sessions as soon as it is inserted (think of several sessions inserting concurrently). And commit also makes it durable - available even after an instance crash - and that supposes that the change is written to disk. What's the cost of that?

Test case

Of course, I cannot do a bulk insert for that, as I want to commit for each row. I take the same test as in the previous post: a table with a primary key (so only one index):

create table DEMO ("id" number , "text" varchar2(15), "number" number) ;

And the same PL/SQL, except that I've added a COMMIT in the loop:

declare
 type people_array is varray(12) of varchar(15);
 people people_array := people_array( 'Marc', 'Bill', 'George', 'Eliot', 'Matt', 'Trey', 'Tracy','Greg', 'Steve', 'Kristina', 'Katie', 'Jeff' );
 people_count number :=people.COUNT;
 n number;
begin
 for i in 0..1e6 loop
  n:=trunc(dbms_random.value(0,10000));
  insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
  commit;
 end loop;
 commit;
end;
/

My test environment is still the Oracle Developer Day 11g VM with 1 CPU allocated by VirtualBox.

 

simple commit in PL/SQL

Yes, this is a special case. I'm running that in a PL/SQL statement, and PL/SQL has a special optimization here because the 'commit successful' message is given to the user only at the end. The intermediate commits are therefore lighter than commits that return to the client. However, look at that 1 million insert + commit execution:

PL/SQL procedure successfully completed.
Elapsed: 00:02:16.34

We are far from the 45 seconds of row-by-row inserts that were committed only at the end. Commit has a big overhead.

 

commit write wait immediate

If you run the inserts from a client, doing a user call for each row, that special optimization is not there. In that case the commit waits for the log writer and returns only when the log writer has acknowledged the write to disk (and to the SYNC standby database if you are in Data Guard). We can do the same from PL/SQL if we issue a COMMIT WRITE WAIT IMMEDIATE instead of a simple commit. So that's a good way to compare the time without introducing additional roundtrip latency.

 for i in 0..1e6 loop
  n:=trunc(dbms_random.value(0,10000));
  insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
  commit write wait immediate;
 end loop;

When executing my procedure with the loop above, here is the time it takes:

PL/SQL procedure successfully completed.
Elapsed: 00:05:06.57

Time has doubled. This is the most expensive insert you can do. But it's also the only way if you want to:

  • Give commit feedback to the user at each commit
  • Ensure that the commit is done as quickly as possible (this is the IMMEDIATE write), favoring response time over throughput.
  • Ensure that the commit is durable (the D in ACID), meaning that we wait until the changes are on persistent storage (this is the WAIT write)

As we can see here, those requirements have a cost.

 

commit write wait batch

We can favor throughput with BATCH commit logging instead of IMMEDIATE.

 for i in 0..1e6 loop
  n:=trunc(dbms_random.value(0,10000));
  insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
  commit write wait batch;
 end loop;

The optimization is not very high here, where I have only one session doing all those inserts:

PL/SQL procedure successfully completed.
Elapsed: 00:04:28.97

but the redo size is reduced. I plan to show more statistics and wait events about it in a future post. The time spent here in the WAIT commits is mostly in those orange 'log file sync' waits we don't like to see on the Enterprise Manager screen. It is the only point where the session process waits for work that is expected to be done asynchronously in the background: at commit, the session waits for the redo to be persisted.

 

commit write nowait immediate

If you are OK with losing a transaction even when the recent commit was successful, then you don't have to wait. That may seem heretical to ACID aficionados, but it is totally acceptable for supporters of 'eventual consistency'. I prefer to think about it according to business requirements.

When something - external to the database - occurs once the commit is successful, then you must trust in ACID and you have to wait. When your customer withdraws money at the ATM, then once they have their money you must be sure that the operation is committed in the database. A crash of the database should not give the money back on their account, because they will not give back the bills.

However, when you insert events coming from sensors (think Big Data), you probably want to keep them for a while in the database but you can accept to lose a few of them in the rare case of a server crash. Especially if that greatly improves the performance and scalability of the process. Maybe you can accept to lose some transactions. Or you are able to load them back again after a crash.
Think of it like NOLOGGING operations. It's not what you want by default, but if you are sure about the business requirements and the recovery procedure, then you have new ways to improve performance and scalability.

 for i in 0..1e6 loop
  n:=trunc(dbms_random.value(0,10000));
  insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
  commit write nowait immediate;
 end loop;

And the improvement is obvious:

PL/SQL procedure successfully completed.
Elapsed: 00:02:08.75

When you don't wait, obviously, you improve the response time and the scalability.

 

commit write nowait batch

And we can batch the redo records, as we did above, in addition to nowait:

 for i in 0..1e6 loop
  n:=trunc(dbms_random.value(0,10000));
  insert into DEMO values( i , people( dbms_random.value(1,people_count) ) , n );
  commit write nowait batch;
 end loop;

And the improvement is obvious:

PL/SQL procedure successfully completed.
Elapsed: 00:01:45.33

Let the log writer do its job in batch, without waiting for its acknowledgement, and the row-by-row commit overhead is minimized. And it's OK as long as you know what you are doing:

  • You can manage the loss of your transactions in case of a server crash, either by re-executing them or because the missing changes have no consequence.
  • You do a WAIT commit when you return a commit status to the user, because he can do something you don't manage and which depends on that commit status.

statement level, session level, instance level

I show only the statement-level way to manage commit wait and commit logging because it's a decision that must be made by the developers. You can choose the same behavior with the commit_wait and commit_logging session parameters, but that's probably not the right option. As I said before, the commit behavior probably depends on where it is in the code. The final commit before returning to the user is probably managed differently.
For the same reason, you probably don't change those parameters at PDB or instance level. But there are always exceptions. If you are doing an application migration that uses a row-by-row API to insert data into the new system, then maybe you can do all that in NOWAIT BATCH by setting the instance parameters (see the sketch below). If it fails, you restart from the beginning (flashback database helps here).
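
As a sketch, the session-level and instance-level equivalents would look like this (the parameter values follow the same WAIT/NOWAIT and IMMEDIATE/BATCH options as the COMMIT WRITE statements above):

-- session level
alter session set commit_wait='NOWAIT';
alter session set commit_logging='BATCH';

-- instance level, e.g. for a one-off migration load (revert it afterwards)
alter system set commit_wait='NOWAIT' scope=memory;
alter system set commit_logging='BATCH' scope=memory;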

This kind of optimization also exists in SQL Server. David Barbarin has blogged about MSSQL delayed durability previously. Want to meet our experts from all technologies in one day? Come to our event In-Memory: boost your IT performance! where we talk about SQL Server, Oracle and SAP HANA.

And as we are talking about cross technology, stay tuned to see 1M row insert variations on other database systems...