Re: MySQL import of XML
Date: Sat, 18 Jul 2015 20:38:44 +0200
Message-ID: <55aa9d34$0$2862$e4fe514c_at_news.xs4all.nl>
On 18-7-2015 19:47, Bjarne Jensen wrote:
> I have converted a pdf to XML - see truncated version:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE pdf2xml SYSTEM "pdf2xml.dtd">
> <pdf2xml producer="poppler" version="0.24.1">
> <page number="1" >
> <text top="836" >DatNum</text>
> <text top="836" > 1/53</text>
> <text top="47" >Manual:</text>
> <text top="59" >Code:</text>
> <text top="71" >Revision No:</text>
> <text top="104" >CHAPTER</text>
> <text top="104" >CODE</text>
> <text top="104" >PAGE</text>
>
> ... zipped ...
>
> <text top="800" >01SEP14</text>
> <text top="812" > GENERAL</text>
> </page>
> <page number="2" >
> <text top="47" >CHAPTER</text>
> <text top="47" >CODE</text>
> ... zipped ...
> <text top="251">03JUN15</text>
> <text top="251">CHG</text>
> </page>
> </pdf2xml>
>
> I need the value/content of
> - page number
> - top number
> - text.
>
> so...:
>
> mysql> CREATE TABLE clist ( number INT, top INT, text VARCHAR(40) );
> Query OK, 0 rows affected (0,06 sec)
>
> Then I picked this up from
> http://stackoverflow.com/questions/5491056
>
> mysql> LOAD XML LOCAL INFILE '/home/me/xml/CheckList_MOD_9.xml'
> INTO TABLE clist(number, top, text);
> Query OK, 0 rows affected (0,01 sec)
> Records: 0 Deleted: 0 Skipped: 0 Warnings: 0
>
> No error messages but also no imported data.
>
> What am I doing wrong?
>
I did something like that some time ago, let me give you what i did...
wget "http://feeds.feedburner.com/pfc" -O $tmp -a $log
mysql test <<EOQ
LOAD DATA INFILE
'$tmp' INTO TABLE pfc CHARACTER SET 'utf8' LINES STARTING BY '<item>' TERMINATED BY '</item>' (_at_tmp) SET title = ExtractValue(_at_tmp, '//title'), link = ExtractValue(_at_tmp, '//link'), guid = ExtractValue(_at_tmp, '//guid'), pubDate = ExtractValue(_at_tmp, '//pubDate'), content = ExtractValue(_at_tmp, '//content:encoded') ;
EOQ after this, i cen see the history on updatesof i.e. Notepad++ ;)
[root_at_test]> select title, min(pubDate) from pfc where title like '%Notepad++%' group by title;
+-----------------------------+---------------------------------+ | title | min(pubDate) | +-----------------------------+---------------------------------+ | [UPDATE] Notepad++ v6.6 | Fri, 02 May 2014 10:36:53 -0400 | | [UPDATE] Notepad++ v6.6.1 | Fri, 02 May 2014 10:36:53 -0400 | | [UPDATE] Notepad++ v6.6.2 | Fri, 09 May 2014 13:10:28 -0400 | | [UPDATE] Notepad++ v6.6.3 | Mon, 19 May 2014 13:30:24 -0400 | | [UPDATE] Notepad++ v6.6.4 | Thu, 05 Jun 2014 12:50:24 -0400 | | [UPDATE] Notepad++ v6.6.6 | Fri, 13 Jun 2014 16:42:37 -0400 | | [UPDATE] Notepad++ v6.6.7 | Wed, 25 Jun 2014 07:21:05 -0400 | | [UPDATE] Notepad++ v6.6.8 | Mon, 28 Jul 2014 12:26:53 -0400 | | [UPDATE] Notepad++ v6.6.9 | Mon, 08 Sep 2014 13:02:45 -0400 | | [UPDATE] Notepad++ v6.7 | Mon, 15 Dec 2014 14:47:13 -0500 | | [UPDATE] Notepad++ v6.7.1 | Mon, 15 Dec 2014 14:47:13 -0500 | | [UPDATE] Notepad++ v6.7.2 | Sun, 28 Dec 2014 06:13:28 -0500 | | [UPDATE] Notepad++ v6.7.3 | Sun, 28 Dec 2014 06:13:28 -0500 | | [UPDATE] Notepad++ v6.7.4 | Sat, 10 Jan 2015 19:33:54 -0500 | | [UPDATE] Notepad++ v6.7.5 | Wed, 11 Mar 2015 08:21:28 -0400 | | [UPDATE] Notepad++ v6.7.6 | Wed, 15 Apr 2015 14:57:17 -0400 | | [UPDATE] Notepad++ v6.7.7 | Wed, 15 Apr 2015 14:57:17 -0400 | | [UPDATE] Notepad++ v6.7.8 | Sun, 17 May 2015 02:07:44 -0400 | | [UPDATE] Notepad++ v6.7.8.1 | Sun, 17 May 2015 02:07:44 -0400 | | [UPDATE] Notepad++ v6.7.8.2 | Sun, 24 May 2015 04:00:05 -0400 | | [UPDATE] Notepad++ v6.7.9 | Fri, 12 Jun 2015 14:08:39 -0400 | | [UPDATE] Notepad++ v6.7.9.1 | Mon, 22 Jun 2015 15:09:56 -0400 || [UPDATE] Notepad++ v6.7.9.2 | Mon, 22 Jun 2015 15:09:56 -0400 |
+-----------------------------+---------------------------------+23 rows in set (6.31 sec) Received on Sat Jul 18 2015 - 20:38:44 CEST