Wednesday, March 21, 2012

From varchar(max) to xml

I have a table with 2 columns. Column a(varchar(max)) and column b(xml).
Column a contains the following data:
Col1;Col2
New York;USA
Rio;Brasil
Tokio;Japan
The first line contains the column header, the following the data.
The data should be transferred to column b with the following xml-structure:
<Col1>New York</Col1><Col2>USA</Col2>
<Col1>Rio</Col1><Col2>Brasil</Col2>
<Col1>Tokio</Col1><Col2>Japan</Col2>
The number of columns and the column names are various.
Any ideas?
Thanks psychodad71
Message posted via webservertalk.com
http://www.webservertalk.com/Uwe/Forum...er-xml/200606/1It looks like this would require a lot of string manipulation. Although I
believe it could be done with T-SQL, the string functions are a little limit
ed.
I'd suggest you use the CLR.
I'll give it a shot myself when I get some time and I'll post an update.
Denis Ruckebusch
http://blogs.msdn.com/denisruc
--
This posting is provided "AS IS" with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm
"psychodad71 via webservertalk.com" <u2248@.uwe> wrote in message
news:6232e7ee08949@.uwe...
>I have a table with 2 columns. Column a(varchar(max)) and column b(xml).
> Column a contains the following data:
> Col1;Col2
> New York;USA
> Rio;Brasil
> Tokio;Japan
> The first line contains the column header, the following the data.
> The data should be transferred to column b with the following xml-structur
e:
> <Col1>New York</Col1><Col2>USA</Col2>
> <Col1>Rio</Col1><Col2>Brasil</Col2>
> <Col1>Tokio</Col1><Col2>Japan</Col2>
> The number of columns and the column names are various.
> Any ideas?
> Thanks psychodad71
> --
> Message posted via webservertalk.com
> http://www.webservertalk.com/Uwe/Forum...er-xml/200606/1|||I can't think of a nice set-based method of doing this, but you could use
procedural code to do it. I agree with Denis, this should probably be done
in the front end. But that said, here's a little procedural sample. Note
that I don't have SQL 2005 on the computer I'm at right now so I had to put
this thing together on SQL 2000. It should run properly on SQL 2005 as
well. It's *extremely* procedural and assumes that the TestInput table has
a numeric id for each row, row 0 being the column names and all other rows
containing data. The procedural nature of this type of code makes me think
you'd be a lot better off doing it on the front end though:
-- Create a "Numbers" table and an inline UDF that uses it to parse your
-- comma-delimited string. Run this section one time.
SELECT TOP 10000 number = IDENTITY(INT, 1, 1)
INTO Numbers
FROM syscomments a1
CROSS JOIN syscomments a2
-- Add Primary Key to Numbers table
ALTER TABLE Numbers
ALTER COLUMN Number INT NOT NULL
ALTER TABLE Numbers
ADD CONSTRAINT PK_Numbers PRIMARY KEY (Number)
-- Create inline UDF
GO
CREATE FUNCTION dbo.ParseDelimitedList (@.list AS NVARCHAR(4000))
RETURNS TABLE
AS
RETURN (
SELECT Number, LTRIM(RTRIM(CASE Number
WHEN 1 THEN SUBSTRING(@.list, 1,
CASE WHEN CHARINDEX(';', @.list, Number + 1) > 0 THEN
CHARINDEX(';', @.list, Number + 1) - 1
ELSE LEN(@.list) - CHARINDEX(';', @.list, Number + 1)
END)
ELSE SUBSTRING(@.list, Number + 1,
CASE WHEN CHARINDEX(';', @.list, Number + 1) > 0 THEN
CHARINDEX(';', @.list, Number + 1) - Number - 1
ELSE LEN(@.list)
END)
END)) AS Value
FROM Numbers
WHERE (SUBSTRING(@.list, Number, 1) = ';' OR Number = 1)
)
GO
-- End of the Numbers table/UDF initialization.
CREATE TABLE TestInput([id] INT PRIMARY KEY,
a VARCHAR(8000),
b VARCHAR(8000))
INSERT INTO TestInput([id], a)
SELECT 0, 'Col1;Col2'
UNION SELECT 1, 'New York;USA'
UNION SELECT 2, 'Rio;Brasil'
UNION SELECT 3, 'Tokio;Japan'
DECLARE @.sql VARCHAR(8000)
DECLARE @.temp_str VARCHAR(8000)
DECLARE @.cols TABLE ([id_num] INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
[col_name] VARCHAR(8000))
SELECT @.temp_str = a
FROM TestInput
WHERE [id] = 0
INSERT INTO @.cols([col_name])
SELECT Value
FROM dbo.ParseDelimitedList(@.temp_str)
ORDER BY [Number]
DECLARE @.col_count INT
SELECT @.col_count = MAX([id_num])
FROM @.cols
DECLARE @.vals TABLE ([id_num] INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
[value] VARCHAR(8000))
DECLARE @.id INT
SELECT @.id = 1
DECLARE @.i INT
WHILE @.id <= (SELECT MAX([id]) FROM TestInput)
BEGIN
SELECT @.temp_str = a
FROM TestInput
WHERE [id] = @.id
IF NOT(@.temp_str IS NULL)
BEGIN
INSERT INTO @.vals([value])
SELECT [Value]
FROM dbo.ParseDelimitedList(@.temp_str)
ORDER BY [Number]
SELECT @.temp_str = ''
SELECT @.i = 1
WHILE @.i <= @.col_count
BEGIN
SELECT @.temp_str = @.temp_str + '<' +
(
SELECT [col_name]
FROM @.cols
WHERE [id_num] = @.i
) + '>'
SELECT @.temp_str = @.temp_str +
(
SELECT COALESCE([value], '')
FROM @.vals
WHERE [id_num] = @.i + (@.id - 1) * @.col_count
)
SELECT @.temp_str = @.temp_str + '</' +
(
SELECT [col_name]
FROM @.cols
WHERE [id_num] = @.i
) + '>'
SELECT @.i = @.i + 1
END
UPDATE TestInput
SET b = @.temp_str
WHERE [id] = @.id
END
SELECT @.id = @.id + 1
END
SELECT *
FROM @.vals
SELECT *
FROM TestInput
"psychodad71 via webservertalk.com" <u2248@.uwe> wrote in message
news:6232e7ee08949@.uwe...
>I have a table with 2 columns. Column a(varchar(max)) and column b(xml).
> Column a contains the following data:
> Col1;Col2
> New York;USA
> Rio;Brasil
> Tokio;Japan
> The first line contains the column header, the following the data.
> The data should be transferred to column b with the following
> xml-structure:
> <Col1>New York</Col1><Col2>USA</Col2>
> <Col1>Rio</Col1><Col2>Brasil</Col2>
> <Col1>Tokio</Col1><Col2>Japan</Col2>
> The number of columns and the column names are various.
> Any ideas?
> Thanks psychodad71
> --
> Message posted via webservertalk.com
> http://www.webservertalk.com/Uwe/Forum...er-xml/200606/1|||Ooops, the "syscomments" references will need to be changed for SQL 2005 to
"sys.comments".
"Mike C#" <xyz@.xyz.com> wrote in message
news:ORxRYYwlGHA.2056@.TK2MSFTNGP03.phx.gbl...
>I can't think of a nice set-based method of doing this, but you could use
>procedural code to do it. I agree with Denis, this should probably be done
>in the front end. But that said, here's a little procedural sample. Note
>that I don't have SQL 2005 on the computer I'm at right now so I had to put
>this thing together on SQL 2000. It should run properly on SQL 2005 as
>well. It's *extremely* procedural and assumes that the TestInput table has
>a numeric id for each row, row 0 being the column names and all other rows
>containing data. The procedural nature of this type of code makes me think
>you'd be a lot better off doing it on the front end though:
> -- Create a "Numbers" table and an inline UDF that uses it to parse your
> -- comma-delimited string. Run this section one time.
> SELECT TOP 10000 number = IDENTITY(INT, 1, 1)
> INTO Numbers
> FROM syscomments a1
> CROSS JOIN syscomments a2
> -- Add Primary Key to Numbers table
> ALTER TABLE Numbers
> ALTER COLUMN Number INT NOT NULL
> ALTER TABLE Numbers
> ADD CONSTRAINT PK_Numbers PRIMARY KEY (Number)
> -- Create inline UDF
> GO
> CREATE FUNCTION dbo.ParseDelimitedList (@.list AS NVARCHAR(4000))
> RETURNS TABLE
> AS
> RETURN (
> SELECT Number, LTRIM(RTRIM(CASE Number
> WHEN 1 THEN SUBSTRING(@.list, 1,
> CASE WHEN CHARINDEX(';', @.list, Number + 1) > 0 THEN
> CHARINDEX(';', @.list, Number + 1) - 1
> ELSE LEN(@.list) - CHARINDEX(';', @.list, Number + 1)
> END)
> ELSE SUBSTRING(@.list, Number + 1,
> CASE WHEN CHARINDEX(';', @.list, Number + 1) > 0 THEN
> CHARINDEX(';', @.list, Number + 1) - Number - 1
> ELSE LEN(@.list)
> END)
> END)) AS Value
> FROM Numbers
> WHERE (SUBSTRING(@.list, Number, 1) = ';' OR Number = 1)
> )
> GO
> -- End of the Numbers table/UDF initialization.
> CREATE TABLE TestInput([id] INT PRIMARY KEY,
> a VARCHAR(8000),
> b VARCHAR(8000))
> INSERT INTO TestInput([id], a)
> SELECT 0, 'Col1;Col2'
> UNION SELECT 1, 'New York;USA'
> UNION SELECT 2, 'Rio;Brasil'
> UNION SELECT 3, 'Tokio;Japan'
> DECLARE @.sql VARCHAR(8000)
> DECLARE @.temp_str VARCHAR(8000)
> DECLARE @.cols TABLE ([id_num] INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
> [col_name] VARCHAR(8000))
> SELECT @.temp_str = a
> FROM TestInput
> WHERE [id] = 0
> INSERT INTO @.cols([col_name])
> SELECT Value
> FROM dbo.ParseDelimitedList(@.temp_str)
> ORDER BY [Number]
> DECLARE @.col_count INT
> SELECT @.col_count = MAX([id_num])
> FROM @.cols
> DECLARE @.vals TABLE ([id_num] INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
> [value] VARCHAR(8000))
> DECLARE @.id INT
> SELECT @.id = 1
> DECLARE @.i INT
> WHILE @.id <= (SELECT MAX([id]) FROM TestInput)
> BEGIN
> SELECT @.temp_str = a
> FROM TestInput
> WHERE [id] = @.id
> IF NOT(@.temp_str IS NULL)
> BEGIN
> INSERT INTO @.vals([value])
> SELECT [Value]
> FROM dbo.ParseDelimitedList(@.temp_str)
> ORDER BY [Number]
> SELECT @.temp_str = ''
> SELECT @.i = 1
> WHILE @.i <= @.col_count
> BEGIN
> SELECT @.temp_str = @.temp_str + '<' +
> (
> SELECT [col_name]
> FROM @.cols
> WHERE [id_num] = @.i
> ) + '>'
> SELECT @.temp_str = @.temp_str +
> (
> SELECT COALESCE([value], '')
> FROM @.vals
> WHERE [id_num] = @.i + (@.id - 1) * @.col_count
> )
> SELECT @.temp_str = @.temp_str + '</' +
> (
> SELECT [col_name]
> FROM @.cols
> WHERE [id_num] = @.i
> ) + '>'
> SELECT @.i = @.i + 1
> END
> UPDATE TestInput
> SET b = @.temp_str
> WHERE [id] = @.id
> END
> SELECT @.id = @.id + 1
> END
> SELECT *
> FROM @.vals
> SELECT *
> FROM TestInput
>
>
> "psychodad71 via webservertalk.com" <u2248@.uwe> wrote in message
> news:6232e7ee08949@.uwe...
>|||psychodad71 via webservertalk.com wrote:
> I have a table with 2 columns. Column a(varchar(max)) and column b(xml).
> Column a contains the following data:
> Col1;Col2
> New York;USA
> Rio;Brasil
> Tokio;Japan
> The first line contains the column header, the following the data.
> The data should be transferred to column b with the following xml-structur
e:
> <Col1>New York</Col1><Col2>USA</Col2>
> <Col1>Rio</Col1><Col2>Brasil</Col2>
> <Col1>Tokio</Col1><Col2>Japan</Col2>
> The number of columns and the column names are various.
> Any ideas?
If your database supports access to external scripting languages, dump
column a out and pass it through the following filter
awk -F\; 'BEGIN {ORS=""}
{if(NR==1)n=split($0,gi);else{for(i=1;i<=NF;++i)print "<" gi[i] ">" $i
"</" gi[i] ">";print "\n"}}'
and read the result into column b. The GNU awk processor for Windows can
be downloaded from http://gnuwin32.sourceforge.net/packages/gawk.htm
///Peter|||Your table will need something to identity the first row, so I added an iden
tity
col. The first row inserted is assumed to be a column header.
CREATE TABLE #t1
(
id int identity primary key,
city VARCHAR(50),
xdata xml DEFAULT ''
)
INSERT INTO #t1 (city) values ('Col1;Col2;Col3;Col4')
INSERT INTO #t1 (city) values ('New York;Boston;Chicago;USA')
INSERT INTO #t1 (city) values ('Rio;Bla;Sao Paulo;Brasil')
INSERT INTO #t1 (city) values ('Tokio;Nagasaki;ABCD;Japan')
INSERT INTO #t1 (city) values ('Tokio;Nagasaki;Japan')
INSERT INTO #t1 (city) values ('Tokio;Nagasaki;ABCD;EF;Japan')
The city column has a compound value in it. You can use recursion to "unflat
ten"
this into a table with one row per city. Once you have done that you can
compose that table into xml, using recursion again.
-- start by making CTE of elementNames
WITH elementNames
AS
(
SELECT TOP(1) id, 1 as colNum, LEFT(city+';', CHARINDEX(';', city+';')-1)
as colName, RIGHT(city+';', LEN(city+';')-CHARINDEX(';', city+';')) as remai
n
from #t1
ORDER BY id
UNION ALL
SELECT t.id, en.colNum + 1 as colNum, LEFT(en.remain, CHARINDEX(';', en.rema
in)-1)
as single, RIGHT(en.remain, LEN(en.remain)-CHARINDEX(';', en.remain)) as
remain from elementNames en
JOIN #t1 AS t ON t.id = en.id
WHERE LEN(remain)>0
),
-- now recurse to "unflatten" the composite value in the city column
pos
AS
(
-- find the first city
SELECT id, 1 as colNum, LEFT(city+';', CHARINDEX(';', city+';')-1) as single
,
RIGHT(city+';', LEN(city+';')-CHARINDEX(';', city+';')) as remain from #t1
WHERE id not in (select id from elementNames)
UNION ALL
-- find the rest
SELECT id, colNum + 1 as colNum,
LEFT(remain, CHARINDEX(';', remain)-1) as single, RIGHT(remain, LEN(remain)-
CHARINDEX(';',
remain))
as remain from pos
where LEN(remain) > 0
and id not in (select id from elementNames)
),
-- now compose xml of of the expanded table
compose
as
(
SELECT p.id, p.colNum, CAST('<' + e.colName + '>' + p.single + '</' + e.colN
ame
+ '>'
as VARCHAR(MAX)) as xdata from pos AS p
JOIN elementNames AS e ON p.colNum = e.colNum
where p.colNum = 1
UNION ALL
SELECT p.id, p.colNum, CAST(c.xdata + '<' + e.colName + '>' + p.single +
'</' + e.colName + '>'
as VARCHAR(MAX)) as xdata from pos AS p
JOIN elementNames AS e ON p.colNum = e.colNum
JOIN compose AS c on p.colNum = c.colNum+1 and p.id = c.id
)
-- use composed xml to update the original table
UPDATE #t1 set xdata = (SELECT xdata from compose where #t1.id = compose.id
AND compose.colNum = (SELECT MAX(colNum) from compose as c WHERE c.id = #t1.
id)
)
Then to test the results:
SELECT * FROM #t1
1 Col1;Col2;Col3;Col4 NULL
2 New York;Boston;Chicago;USA <Col1>New
York</Col1><Col2>Boston</Col2><Col3>Chicago</Col3><Col4>USA</Col4>
3 Rio;Bla;Sao Paulo;Brasil <Col1>Rio</Co
l1><Col2>Bla</Col2><Col3>Sao
Paulo</Col3><Col4>Brasil</Col4>
4 Tokio;Nagasaki;ABCD;Japan <Col1>Tokio</
Col1><Col2>Nagasaki</Col2><Col3>ABCD</Col3><Col4>Japan</Col4>
5 Tokio;Nagasaki;Japan <Col1>Tokio</
Col1><Col2>Nagasaki</Col2><Col3>Japan</Col3>
6 Tokio;Nagasaki;ABCD;EF;Japan <Col1>Tokio</
Col1><Col2>Nagasaki</Col2><Col3>ABCD</Col3><Col4>EF</Col4>
Note that this works as even when the number of column headings do not match
the number of cities, though the results might not be what you want.
Dan

> I have a table with 2 columns. Column a(varchar(max)) and column
> b(xml). Column a contains the following data:
> Col1;Col2
> New York;USA
> Rio;Brasil
> Tokio;Japan
> The first line contains the column header, the following the data.
> The data should be transferred to column b with the following
> xml-structure:
> <Col1>New York</Col1><Col2>USA</Col2>
> <Col1>Rio</Col1><Col2>Brasil</Col2>
> <Col1>Tokio</Col1><Col2>Japan</Col2>
> The number of columns and the column names are various.
> Any ideas?
> Thanks psychodad71
>|||For some reason I see that some of the xml like stuff I have shown in this q
uery seems to be lost once it is posted. I have attached a text file version
of it.
Dan|||Alternatively, using a numbers table
as in http://www.aspfaq.com/show.asp?id=2516
you can do this
;
with Headers(id,rn,ColName)
as(
select id,
rank() over(order by Number),
ltrim(substring(city,
Number,
charindex(';',
city + ';',
Number) - Number))
from #t1
inner join Numbers on Number between 1 and len(city) + 1
and substring(';' + city, Number, 1) = ';'
where id=1),
Cities(id,rn,City)
as(
select id,
rank() over(partition by id order by Number),
ltrim(substring(city,
Number,
charindex(';',
city + ';',
Number) - Number))
from #t1
inner join Numbers on Number between 1 and len(city) + 1
and substring(';' + city, Number, 1) = ';'
where id>1)
update #t1
set xdata=(select cast('<'+h.ColName+'>'+c.City+'</'+h.ColName+'>' as
xml)
from Headers h
inner join Cities c on c.rn=h.rn
where c.id=#t1.id
for xml path(''))
select * from #t1
Regards
Mark|||Slight correction.
where c.id=#t1.id
for xml path(''))
should be
where c.id=#t1.id
order by c.rn
for xml path(''))

No comments:

Post a Comment