How to use Lucene.Net with Windows Azure SQL Database
http://social.technet.microsoft.com/wiki/contents/articles/2367.How-to-use-lucene-net-with-windows-azure-sql-database.aspx
Table of Contents
- Summary
- Lucene.Net
- The Azure Library for Lucene.Net
- Using Lucene.Net to index SQL Database
- Searching the Lucene.Net Catalog
- References
Summary
Lucene.Net is a .NET implementation of the Lucene full-text search engine. This article describes how to use Lucene.Net to index text data stored in Windows Azure SQL Database, and then perform searches against that data.
Note: This does not provide an integrated full-text search experience like Full-Text Search in SQL Server. Lucene.Net is an external process that is queried separately from SQL Database.
Note: This article relies on the "Azure Library for Lucene.Net" (https://azuredirectory.codeplex.com/) to store the Lucene.Net catalog in a Windows Azure storage blob.
Prerequisites
- Windows Azure Account (offers and purchasing information at http://www.microsoft.com/windowsazure/offers/default.aspx)
- Visual Studio 2010
- Lucene.Net (http://lucenenet.apache.org/; both binaries and the source project are available)
- Azure Library for Lucene.Net (https://azuredirectory.codeplex.com/)
To use the Azure Library for Lucene.Net and Lucene.Net from a Visual Studio project, you must add a reference to both the AzureDirectory project or assembly and the Lucene.Net project or assembly. You must also add the following using statements to your project:
using Lucene.Net;
using Lucene.Net.Store;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Documents;
using Lucene.Net.Util;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Store.Azure;
Lucene.Net
Lucene.Net is a .NET implementation of Lucene (http://lucene.apache.org/) and provides full-text indexing and search of documents. Documents are composed of multiple fields and do not have a predefined schema. When performing a query against the index, you can search across multiple fields within a document. Lucene.Net doesn't integrate directly with SQL Database; instead, you must perform a query against a database and construct a Document from the results, which is then cataloged by Lucene.Net. For more information on Lucene.Net, see http://lucenenet.apache.org/.
The Azure Library for Lucene.Net
This library allows you to expose blob storage as a Lucene.Net.Store.Directory object, which Lucene.Net uses as persistent storage for its catalog. More information on the Azure Library for Lucene.Net, as well as the latest version, can be found on the project homepage at https://azuredirectory.codeplex.com/.
The current version of the Azure Library (as of this writing) may require modification before you use it in your solution. Specifically:
- It may launch the Visual Studio Project Conversion Wizard when opened.
- The reference to Microsoft.WindowsAzure.Storage may need to be deleted and re-created to point to the most recent version of the assembly.
- There are several Debug.WriteLine statements that should be converted to Trace.Write or another member of the Trace class, as documented at http://msdn.microsoft.com/en-us/library/ff966484.aspx. If you are not interested in diagnostic output, you can simply remove the Debug.WriteLine statements.
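As a rough illustration of the last item, the sketch below shows a Debug.WriteLine call replaced with Trace.WriteLine. The message text and the console-backed listener are assumptions made for this example and are not taken from the AzureDirectory source; which listener you register depends on your hosting environment.

```csharp
using System;
using System.Diagnostics;

class TraceExample
{
    static void Main()
    {
        // Assumption: a console-backed listener is registered here only so the
        // output is visible when running this sketch locally.
        Trace.Listeners.Add(new TextWriterTraceListener(Console.Out));

        // Before: Debug.WriteLine("Syncing blob");
        //   (Debug calls are compiled only when the DEBUG symbol is defined)
        // After: Trace calls are compiled whenever the TRACE symbol is defined.
        Trace.WriteLine("Syncing blob");

        // Make sure buffered output reaches the listener before exiting.
        Trace.Flush();
    }
}
```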
Using the Library
The following code creates an AzureDirectory object and uses it as a parameter when creating the IndexWriter:
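// Create the AzureDirectory against blob storage, using a catalog named 'TestCatalog'
AzureDirectory azureDirectory = new AzureDirectory(
    CloudStorageAccount.FromConfigurationSetting("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"),
    "TestCatalog");
// Passing true creates a new catalog, overwriting any existing one
IndexWriter indexWriter = new IndexWriter(azureDirectory, new StandardAnalyzer(), true);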
Using Lucene.Net to index SQL Database
As mentioned previously, Lucene.Net isn't integrated directly with SQL Database and is based on indexing 'documents' that contain multiple fields. To index data from SQL Database, you must query the database and create a new Document object for each row. Individual columns can then be added to the Document. The following code illustrates querying a SQL Database that contains information on individual bloggers, and then adding the Id and Bio column information to the Lucene index using an IndexWriter and Document:
// Requires: using System.Data; using System.Data.SqlClient;
// Create the AzureDirectory against blob storage and create a catalog named 'Catalog'
AzureDirectory azureDirectory = new AzureDirectory(
    CloudStorageAccount.FromConfigurationSetting("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"),
    "Catalog");
// Passing true creates a new catalog, overwriting any existing one
IndexWriter indexWriter = new IndexWriter(azureDirectory, new StandardAnalyzer(), true);
indexWriter.SetRAMBufferSizeMB(10.0);
indexWriter.SetUseCompoundFile(false);
indexWriter.SetMaxMergeDocs(10000);
indexWriter.SetMergeFactor(100);

// Create a DataSet and fill it from SQL Database
DataSet ds = new DataSet();
using (SqlConnection sqlCon = new SqlConnection(sqlConnString))
{
    sqlCon.Open();
    SqlCommand sqlCmd = new SqlCommand();
    sqlCmd.Connection = sqlCon;
    sqlCmd.CommandType = CommandType.Text;
    // Only get the minimum fields we need; Bio to index, Id so search results
    // can look up the record in SQL Database
    sqlCmd.CommandText = "select Id, Bio from bloggers";
    SqlDataAdapter sqlAdap = new SqlDataAdapter(sqlCmd);
    sqlAdap.Fill(ds);
}

// Check that the query produced a table before reading it
if (ds.Tables.Count > 0)
{
    DataTable dt = ds.Tables[0];
    foreach (DataRow dr in dt.Rows)
    {
        // Create the Document object
        Document doc = new Document();
        foreach (DataColumn dc in dt.Columns)
        {
            // Populate the document with the column name and value from our query
            doc.Add(new Field(
                dc.ColumnName,
                dr[dc.ColumnName].ToString(),
                Field.Store.YES,
                Field.Index.TOKENIZED));
        }
        // Write the Document to the catalog
        indexWriter.AddDocument(doc);
    }
}
// Close the writer
indexWriter.Close();
Note: The above sample returns all rows and adds them to the catalog. In a production application, you will most likely want to add only new or updated rows.
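One possible approach to indexing only changed rows is sketched below. It opens the existing catalog without re-creating it and calls IndexWriter.UpdateDocument, which deletes any document containing the given term and then adds the replacement. The sketch rests on several assumptions that do not appear in the sample above: a hypothetical LastModified column on the bloggers table, a lastIndexedUtc value that your application stores between runs, a catalog that already exists, and the helper name IndexChangedRows. It also indexes Id without tokenizing it so that the term lookup matches the stored value exactly.

```csharp
using System;
using System.Data.SqlClient;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store.Azure;

static class IncrementalIndexer
{
    // Hypothetical helper: re-indexes only rows changed since lastIndexedUtc.
    public static void IndexChangedRows(
        AzureDirectory azureDirectory, string sqlConnString, DateTime lastIndexedUtc)
    {
        // Passing false opens the existing catalog instead of overwriting it.
        IndexWriter indexWriter = new IndexWriter(azureDirectory, new StandardAnalyzer(), false);
        try
        {
            using (SqlConnection sqlCon = new SqlConnection(sqlConnString))
            using (SqlCommand sqlCmd = new SqlCommand(
                // Assumption: LastModified is a hypothetical change-tracking column.
                "select Id, Bio from bloggers where LastModified > @since", sqlCon))
            {
                sqlCmd.Parameters.AddWithValue("@since", lastIndexedUtc);
                sqlCon.Open();
                using (SqlDataReader reader = sqlCmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        string id = reader["Id"].ToString();
                        Document doc = new Document();
                        // Id is stored as a single untokenized term so it can act as a key.
                        doc.Add(new Field("Id", id, Field.Store.YES, Field.Index.UN_TOKENIZED));
                        doc.Add(new Field("Bio", reader["Bio"].ToString(),
                            Field.Store.YES, Field.Index.TOKENIZED));
                        // Delete any document whose Id term matches, then add the new one.
                        indexWriter.UpdateDocument(new Term("Id", id), doc);
                    }
                }
            }
        }
        finally
        {
            // Close the writer even if the query or indexing fails.
            indexWriter.Close();
        }
    }
}
```

Rows deleted from SQL Database are not handled by this sketch; they would need a separate pass that removes the corresponding documents from the catalog.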
Searching the Lucene.Net Catalog
After you have added documents to the catalog, you can perform a search against them using the IndexSearcher. The following example illustrates how to perform a search against the catalog for a term contained in the 'Bio' field and return the Id of each result:
// Create the AzureDirectory for blob storage
AzureDirectory azureDirectory = new AzureDirectory(
    CloudStorageAccount.FromConfigurationSetting("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"),
    "Catalog");
// Create the IndexSearcher
IndexSearcher indexSearcher = new IndexSearcher(azureDirectory);
// Create the QueryParser, setting the default search field to 'Bio'
QueryParser parser = new QueryParser("Bio", new StandardAnalyzer());
// Create a query from the Parser
Query query = parser.Parse(searchString);
// Retrieve matching hits
Hits hits = indexSearcher.Search(query);
// Loop through the matching hits, retrieving the document
for (int i = 0; i < hits.Length(); i++)
{
    // Retrieve the string value of the 'Id' field from the hits.Doc(i) document
    TextBox_Results.Text += "Id: " + hits.Doc(i).GetField("Id").StringValue() + "\n";
}
// Release the searcher when finished
indexSearcher.Close();
Based on the Id, you can perform a query against SQL Database to return additional fields from the matching record.
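A minimal sketch of such a lookup follows. It assumes the bloggers table from the indexing sample, that Id is an integer column, and a hypothetical helper named GetBloggerById; adjust these to match your schema. The query is parameterized so that the Id value read from the index is never concatenated into the SQL text.

```csharp
using System.Data;
using System.Data.SqlClient;

static class BloggerLookup
{
    // Hypothetical helper: returns the matching row, or an empty table if none exists.
    public static DataTable GetBloggerById(string sqlConnString, string idFromIndex)
    {
        DataTable result = new DataTable();
        using (SqlConnection sqlCon = new SqlConnection(sqlConnString))
        using (SqlCommand sqlCmd = new SqlCommand(
            "select * from bloggers where Id = @id", sqlCon))
        {
            // Assumption: Id is an int column; change the SqlDbType if your schema differs.
            sqlCmd.Parameters.Add("@id", SqlDbType.Int).Value = int.Parse(idFromIndex);
            using (SqlDataAdapter sqlAdap = new SqlDataAdapter(sqlCmd))
            {
                // Fill opens the connection if it is closed and closes it afterwards.
                sqlAdap.Fill(result);
            }
        }
        return result;
    }
}
```

For a given hit, the helper could be called as GetBloggerById(sqlConnString, hits.Doc(i).GetField("Id").StringValue()).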
References
- https://azuredirectory.codeplex.com/
- http://www.logiclabz.com/c/create-lucene-index-in-c-for-given-sql-stored-procedure.aspx
- http://lucene.apache.org/
- http://lucenenet.apache.org/
- http://www.ifdefined.com/blog/post/2009/02/Full-Text-Search-in-ASPNET-using-LuceneNET.aspx
- http://blogs.msdn.com/b/windows-azure-support/archive/2010/11/01/how-to-use-lucene-net-in-windows-azure.aspx