I would like to be able to search the metadata of files (e.g. Title,
Keywords,
etc) that I have stored in a database. I have included the "metainfo" column
(image) in the full-text catalog. Running a freetexttable on the documents
table returns matching results for web pages (.htm, .aspx), however for
other documents (e.g. word, excel, etc), I am getting no results. Is this
an IFilter issue? Am I missing something here?
Thanks in advance!
This will work in SQL 2005, however you will not be able to search on only
the property. The contains search will be on all properties.
In SQL 2000 you will have to extract the properties and store them in
seperate columns and SQL FTI them.
"Jeremy" <jmaddreysp@.cox.net> wrote in message
news:%23K$d4z9BFHA.3940@.TK2MSFTNGP09.phx.gbl...
>I would like to be able to search the metadata of files (e.g. Title,
> Keywords,
> etc) that I have stored in a database. I have included the "metainfo"
> column
> (image) in the full-text catalog. Running a freetexttable on the
> documents
> table returns matching results for web pages (.htm, .aspx), however for
> other documents (e.g. word, excel, etc), I am getting no results. Is this
> an IFilter issue? Am I missing something here?
> Thanks in advance!
>
|||Jeremy,
Could you post the full output of -- SELECT @.@.version -- as this is most
helpful in understanding your environment and will help more quickly answer
your questions.
If you're using SQL Server 7.0, you are limited to extracting the text from
file and storing this text in a column defined with TEXT or NText datatype.
I'm assuming from your comments that you're using SQL Server 2000, you can
store binary files (MS word, MS Excel, Adobe PDF files, etc.) in an column
defined with an IMAGE datatype. Note, you will need to also define a "file
extension" column that identifies the file type (use char(3), varchar(4) or
sysname) and populate this column with "doc", ".doc" respectively.
Can I assume that if you search on keywords that you know to be in the body
of the documents (e.g. word, excel, etc) that you do get the row that
contains that keyword returned in your CONTAINS or FREETEXT query?
Thanks,
John
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/
"Jeremy" <jmaddreysp@.cox.net> wrote in message
news:#K$d4z9BFHA.3940@.TK2MSFTNGP09.phx.gbl...
> I would like to be able to search the metadata of files (e.g. Title,
> Keywords,
> etc) that I have stored in a database. I have included the "metainfo"
column
> (image) in the full-text catalog. Running a freetexttable on the
documents
> table returns matching results for web pages (.htm, .aspx), however for
> other documents (e.g. word, excel, etc), I am getting no results. Is this
> an IFilter issue? Am I missing something here?
> Thanks in advance!
>
|||btw-you might want to check out
http://www.indexserverfaq.com/blobs.htm for more information on how to index
blobs.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
"Hilary Cotter" <hilary.cotter@.gmail.com> wrote in message
news:OeOh$1%23BFHA.2392@.TK2MSFTNGP14.phx.gbl...[vbcol=seagreen]
> This will work in SQL 2005, however you will not be able to search on only
> the property. The contains search will be on all properties.
> In SQL 2000 you will have to extract the properties and store them in
> seperate columns and SQL FTI them.
> "Jeremy" <jmaddreysp@.cox.net> wrote in message
> news:%23K$d4z9BFHA.3940@.TK2MSFTNGP09.phx.gbl...
this
>
|||John,
I am using SQL Server 2000. Here's my select statement:
SELECT * FROM FreeTextTable(Docs, MetaInfo, @.SearchTerm)
I have two columns defined on the document information, MetaInfo and Content
(as well as the extension column that stores the doc extension). When using
terms I know to be in the document, I can successfully return results from
the "Content" column, however, the "MetaInfo" column does not seem to return
any correct results (besides web pages). I have programmatically pulled the
column data and viewed the metainfo column in an application with no
problem. However, the freetexttable statement I'm using does not seem to be
finding the search term.
Ideas?
"John Kane" <jt-kane@.comcast.net> wrote in message
news:uxQNbJACFHA.1452@.TK2MSFTNGP11.phx.gbl...
> Jeremy,
> Could you post the full output of -- SELECT @.@.version -- as this is most
> helpful in understanding your environment and will help more quickly
answer
> your questions.
> If you're using SQL Server 7.0, you are limited to extracting the text
from
> file and storing this text in a column defined with TEXT or NText
datatype.
> I'm assuming from your comments that you're using SQL Server 2000, you can
> store binary files (MS word, MS Excel, Adobe PDF files, etc.) in an column
> defined with an IMAGE datatype. Note, you will need to also define a "file
> extension" column that identifies the file type (use char(3), varchar(4)
or
> sysname) and populate this column with "doc", ".doc" respectively.
> Can I assume that if you search on keywords that you know to be in the
body[vbcol=seagreen]
> of the documents (e.g. word, excel, etc) that you do get the row that
> contains that keyword returned in your CONTAINS or FREETEXT query?
> Thanks,
> John
> --
> SQL Full Text Search Blog
> http://spaces.msn.com/members/jtkane/
>
> "Jeremy" <jmaddreysp@.cox.net> wrote in message
> news:#K$d4z9BFHA.3940@.TK2MSFTNGP09.phx.gbl...
> column
> documents
this
>
|||Jeremy,
Actually, I have more follow-up questions as I want to be sure that I
understand your environment...
Could you confirm your FT-enabled table (Docs) column definitions? Content
defined with an IMAGE datatype. (Correct?) and MetaInfo defined with what
type of datatype? What is the content of the MetaInfo column? Is it document
property type data, such as Author, Company, Keywords, etc.? If so, how did
you import the text into this column?
Thanks!
John
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/
"Jeremy" <jmaddreysp@.cox.net> wrote in message
news:uNVt0tHCFHA.3376@.TK2MSFTNGP12.phx.gbl...
> John,
> I am using SQL Server 2000. Here's my select statement:
> SELECT * FROM FreeTextTable(Docs, MetaInfo, @.SearchTerm)
> I have two columns defined on the document information, MetaInfo and
Content
> (as well as the extension column that stores the doc extension). When
using
> terms I know to be in the document, I can successfully return results from
> the "Content" column, however, the "MetaInfo" column does not seem to
return
> any correct results (besides web pages). I have programmatically pulled
the
> column data and viewed the metainfo column in an application with no
> problem. However, the freetexttable statement I'm using does not seem to
be[vbcol=seagreen]
> finding the search term.
> Ideas?
>
> "John Kane" <jt-kane@.comcast.net> wrote in message
> news:uxQNbJACFHA.1452@.TK2MSFTNGP11.phx.gbl...
> answer
> from
> datatype.
can[vbcol=seagreen]
column[vbcol=seagreen]
"file[vbcol=seagreen]
> or
> body
for
> this
>
|||On the Docs table, I have FT-enabled the following:
DocName nvarchar(128)
MetaInfo image(16)
Content image(16)
Here's an example of the MetaInfo that I have pulled out programmatically.
vti_cachedcustomprops:VX|vti_title
vti_modifiedby:SR|lab2003\\jmaddrey
vti_cachedtitle:SR|catsdogs
vti_title:SR|catsdogs
vti_author:SR|lab2003\\jmaddrey
When I perform the search "catsdogs", nothing is found. However, any text I
place within the body of the document is accurately returned.
In regards to importing the data into the metainfo column, I am working with
a Windows Sharepoint Services site, so the database was generated by
Sharepoint. I am not exactly sure how that data is extracted by the
Sharepoint framework.
Thanks!
Jeremy
"John Kane" <jt-kane@.comcast.net> wrote in message
news:enMKQrMCFHA.1404@.TK2MSFTNGP11.phx.gbl...
> Jeremy,
> Actually, I have more follow-up questions as I want to be sure that I
> understand your environment...
> Could you confirm your FT-enabled table (Docs) column definitions?
Content
> defined with an IMAGE datatype. (Correct?) and MetaInfo defined with what
> type of datatype? What is the content of the MetaInfo column? Is it
document
> property type data, such as Author, Company, Keywords, etc.? If so, how
did[vbcol=seagreen]
> you import the text into this column?
> Thanks!
> John
> --
> SQL Full Text Search Blog
> http://spaces.msn.com/members/jtkane/
>
> "Jeremy" <jmaddreysp@.cox.net> wrote in message
> news:uNVt0tHCFHA.3376@.TK2MSFTNGP12.phx.gbl...
> Content
> using
from[vbcol=seagreen]
> return
> the
to[vbcol=seagreen]
> be
most[vbcol=seagreen]
> can
> column
> "file
varchar(4)[vbcol=seagreen]
"metainfo"[vbcol=seagreen]
> for
Is
>
|||Anyone? Anyone?
"Jeremy" <jmaddreysp@.cox.net> wrote in message
news:uX$hrbTCFHA.1408@.TK2MSFTNGP10.phx.gbl...
> On the Docs table, I have FT-enabled the following:
> DocName nvarchar(128)
> MetaInfo image(16)
> Content image(16)
> Here's an example of the MetaInfo that I have pulled out programmatically.
> vti_cachedcustomprops:VX|vti_title
> vti_modifiedby:SR|lab2003\\jmaddrey
> vti_cachedtitle:SR|catsdogs
> vti_title:SR|catsdogs
> vti_author:SR|lab2003\\jmaddrey
> When I perform the search "catsdogs", nothing is found. However, any text
I
> place within the body of the document is accurately returned.
> In regards to importing the data into the metainfo column, I am working
with[vbcol=seagreen]
> a Windows Sharepoint Services site, so the database was generated by
> Sharepoint. I am not exactly sure how that data is extracted by the
> Sharepoint framework.
> Thanks!
> Jeremy
>
> "John Kane" <jt-kane@.comcast.net> wrote in message
> news:enMKQrMCFHA.1404@.TK2MSFTNGP11.phx.gbl...
> Content
what[vbcol=seagreen]
> document
> did
> from
pulled[vbcol=seagreen]
> to
> most
text[vbcol=seagreen]
you[vbcol=seagreen]
> varchar(4)
the[vbcol=seagreen]
that[vbcol=seagreen]
Title,[vbcol=seagreen]
> "metainfo"
however
> Is
>
|||Jeremy,
Sorry, for the late reply! I've had a lot of things going on lately, no
excuse, but just keeping me busy...
The thing about this is I don't have a real answer for you as metadata in
the MetaInfo column was generated by Sharepoint and I too am I am not
exactly sure how that data is extracted by the Sharepoint framework.
Fundamentally, and outside of Sharepoint, SQL FTS must have a valid file
extension, usually .doc for MS Word documents, or .pdf for Adobe PDF file
types as is with your Content column. This file extension is then used by
the MSSearch service and calls a daemon process to FT Index a specific file
type. However, there is no file extension column and file type that is
linked with the MetaInfo column (defined with the image datatype) in order
to successfully FT Index the metadata generated by SharePoint, I'm sad to
say...
At this point, I'd recommend that you open a support case with Microsoft PSS
Sharepoint support first (and if they can't help, then SQL Server support)
and have them help you through this understanding how to accomplish what
you're trying to do. If you don't mind, if they do provide a solution, could
you post it back here in this newsgroup for others to read?
Thanks,
John
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/
"Jeremy" <jmaddreysp@.cox.net> wrote in message
news:OhD8seVDFHA.512@.TK2MSFTNGP15.phx.gbl...[vbcol=seagreen]
> Anyone? Anyone?
>
> "Jeremy" <jmaddreysp@.cox.net> wrote in message
> news:uX$hrbTCFHA.1408@.TK2MSFTNGP10.phx.gbl...
programmatically.[vbcol=seagreen]
text[vbcol=seagreen]
> I
> with
> what
how[vbcol=seagreen]
When[vbcol=seagreen]
results[vbcol=seagreen]
to[vbcol=seagreen]
> pulled
seem[vbcol=seagreen]
is[vbcol=seagreen]
quickly[vbcol=seagreen]
> text
> you
an[vbcol=seagreen]
a[vbcol=seagreen]
> the
> that
> Title,
the[vbcol=seagreen]
> however
results.
>
|||Can you explain what you mean by pulled out programmatically?
I took your content, stored it in and image data type column and using a
value of txt for the document column type I was able to query for catsdogs.
Can you perhaps tell us what the value of your document type column is?
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Now available on Amazon.com
http://www.amazon.com/gp/product/off...?condition=all
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Jeremy" <jmaddreysp@.cox.net> wrote in message
news:uX$hrbTCFHA.1408@.TK2MSFTNGP10.phx.gbl...
> On the Docs table, I have FT-enabled the following:
> DocName nvarchar(128)
> MetaInfo image(16)
> Content image(16)
> Here's an example of the MetaInfo that I have pulled out programmatically.
> vti_cachedcustomprops:VX|vti_title
> vti_modifiedby:SR|lab2003\\jmaddrey
> vti_cachedtitle:SR|catsdogs
> vti_title:SR|catsdogs
> vti_author:SR|lab2003\\jmaddrey
> When I perform the search "catsdogs", nothing is found. However, any text
I
> place within the body of the document is accurately returned.
> In regards to importing the data into the metainfo column, I am working
with[vbcol=seagreen]
> a Windows Sharepoint Services site, so the database was generated by
> Sharepoint. I am not exactly sure how that data is extracted by the
> Sharepoint framework.
> Thanks!
> Jeremy
>
> "John Kane" <jt-kane@.comcast.net> wrote in message
> news:enMKQrMCFHA.1404@.TK2MSFTNGP11.phx.gbl...
> Content
what[vbcol=seagreen]
> document
> did
> from
pulled[vbcol=seagreen]
> to
> most
text[vbcol=seagreen]
you[vbcol=seagreen]
> varchar(4)
the[vbcol=seagreen]
that[vbcol=seagreen]
Title,[vbcol=seagreen]
> "metainfo"
however
> Is
>
Wednesday, March 7, 2012
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment