Squeezing spaces out of a string

From: --CELKO-- <joe.celko_at_northface.edu>
Date: 31 Jan 2004 16:09:00 -0800
Message-ID: <a264e7ea.0401311609.7c93100a_at_posting.google.com>



This problem comes up on newsgroup about once a year. Given a VARCHAR(n) column with words in it, how do you squeeze out the extra spaces, that each word is separated by only one space?

You can nest function calls up to 32 levels deep in SQL Server and they have REPLACE (<target string>, <old string>, <new string>) function, so answer is simply:

UPDATE Foobar

   SET col_x

  • REPLACE ( REPLACE ( REPLACE (
    ..
    REPLACE(col_x, ' ', ' ')
    ..
    ' ', ' ') ' ', ' ') ' ', ' ');

This is a LOT faster than hanging in a loop and it is pure SQL, not proprietary, procedural code. But now a matb problem for you: let col_x be VARCHAR(n). What is the optimal mix of replace function calls and what should they look like for the general case of (n)? There can be more than one word in the string, so you can have what typeseters call "rivers" in the string.

Prove it.

Let (j->k) mean "replaces (j) spaces with (k) spaces"

  1. (2->1) repeated log2(n) times?
  2. Is it best to always have (k->1)?
  3. (floor(sqrt(n)) -> 1) as a starter?
  4. Fibonnaci series?

I am in a weird mood this afternoon, so I'll offer a book as a prize for a proof.

--CELKO--



 Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, datatypes, etc. in your schema are. Received on Sun Feb 01 2004 - 01:09:00 CET

Original text of this message