AW: UUID vs. Sequential ID as Primary

From: <ahmed.fikri_at_t-online.de>
Date: Sat, 13 Apr 2024 19:06:19 +0200 (CEST)
Message-ID: <1713027979938.206653.372c77a4ab4bff45303c8d47c7b16dc39df8b164_at_spica.telekom.de>



Thank you very much for this information        

-----Original-Nachricht-----
Betreff: Re: UUID vs. Sequential ID as Primary Datum: 2024-04-13T18:09:50+0200
Von: "Jonathan Lewis" <jlewisoracle_at_gmail.com> An: "list, oracle" <oracle-l_at_freelists.org>      

It's always good to hear about investigations that people are doing to understand what's going on.
I couldn't really comment with any confidence on the figures you found because I don't know what Postgres does with indexes - what the row overheads are and what happens on leaf block splits, in particular.  

The 17M sequence-based index would seem fairly reasonable from Oracle's perspective if the value were always stored as the full 8 bytes (which isn't what Oracle actually does) because there are 3 more bytes for row overhead, and 6 bytes holding the address of the<http://row.in> the table. Also for monotonic increasing values the index leaf blocks would "split" in what Oracle calls 90/10 mode but which really means add a new empty block and use that for the next value, while for randomly arriving data if a block is full and a value arrives that belongs in that block then Oracle does a "50/50" leaf block split and puts the row into the appropriate one of the two resulting half-full blocks. For randomly arriving values the blocks in the index tend towards being roughly 75% used with 25% free space.  

Regards
Jonathan Lewis    

On Sat, 13 Apr 2024 at 15:11, ahmed.fikri_at_t-online.de <mailto:ahmed.fikri_at_t-online.de> <ahmed.fikri_at_t-online.de <mailto:ahmed.fikri_at_t-online.de> > wrote:    

  I didn't notice the thing with 37 MB (thanks :-)). And it is the same for   Sequence that uses Long (8 Bytes). This give me a reason to conduct the   test with Oracle 21 and see if we get the same behaviour (as we are   working with this version).    

  I must also admit that my tests are initially rudimentary (And it's done   in my spare time, unfortunately not within the project I'm currently   working on). I am simply trying to understand why many teams in our   organization exclusively use UUIDs (and also in many other   organizations). But of course, the whole thing needs to be examined more   closely. My hope was that someone from this list would state whether it   is a bad or good practice to work with UUIDs, providing the reason.    

  When I worked on projects where only PL/SQL was used, this issue was not   questioned.    

  Regards
  Ahmed                    

  -----Original-Nachricht-----
  Betreff: Re: UUID vs. Sequential ID as Primary   Datum: 2024-04-13T15:35:25+0200
  Von: "Jonathan Lewis" <jlewisoracle_at_gmail.com
<mailto:jlewisoracle_at_gmail.com> >
  An: "list, oracle" <oracle-l_at_freelists.org
<mailto:oracle-l_at_freelists.org> >
               

  I raised the concurrency issue for two reasons:    

  1. The sequence was slower by 2.3 seconds on the creation of ids, but was faster by 9 seconds on saving data. That made sequences overall a better bet (despite your comment about uuid outperforming sequences) on a pure performance point.
  2. The index size for UUID was 37MB - but a UUID is only 16 bytes, so the index was more than double the size of the values it was storing. From an Oracle perspective that could have been a side effect of the randomness of the data (plus a component to do with row pointers) - but the size of the Oracle index was as I would have expected for a serial test of sequence values, and therefore seriously undersized for a test of reasonable concurrency. SO ... this led me to the point of wondering (in my ignorance) whether Postgres was basically behaving badly even in a serial test with a risk of getting MUCH worse in a concurrent test.

  I appreciate that you have a stated requirement to see the ID values   before they are used on the insert - in which case the locally generated   UUID is clearly the sensible option (without doing any testing) because a   round-trip to the database to get the server to generate a UUID has to be   more time-consuming than generating the UUID locally; but you still ought   to test whether the local option turns into a total disaster for saving   and retrieval at your expected level of concurrency.    

  Regards
  Jonathan Lewis        

  On Sat, 13 Apr 2024 at 14:10, ahmed.fikri_at_t-online.de
<mailto:ahmed.fikri_at_t-online.de> <ahmed.fikri_at_t-online.de
<mailto:ahmed.fikri_at_t-online.de> > wrote:
    If the UUID method already outperforms sequences in a single session,     then the performance of the sequence method will be even worse in     multiple concurrent sessions. Additionally, UUIDs are generated on the     client side. Therefore, I believe that conducting the test with only     one session is sufficient. My aim was simply to confirm that UUIDs are     a better choice for microservice architecture      

    Regards
    Ahmed

     
     
     
   

 



--
http://www.freelists.org/webpage/oracle-l
Received on Sat Apr 13 2024 - 19:06:19 CEST

Original text of this message