Wiring together Lucene.net and EffiProz

Apr 20, 2010 at 9:00 PM

Hello, I am trying to wire together Lucene.net and EffiProz so that I can have a full-text search feature for data stored in the database. Primarily I'd be using this against CLOB data.

The approach I was going to take is:

Maintain the index "post commit" so that it is updated in near real time.
I was considering using triggers to record, in a separate table, the operation type and the key of the row that the insert/update/delete occurred on.

A separate thread can then process the rows in this table post commit, so that I don't have to deal with rollbacks in different sessions. Or do you have any other ideas?
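For illustration, here is a rough sketch of the trigger-fed queue table idea. It uses Python's built-in sqlite3 module as a stand-in, since EffiProz trigger syntax isn't shown in this thread. The table names (test, index_queue), the column names and the 'I'/'U'/'D' operation codes are all assumptions made up for the example.

```python
import sqlite3

def make_db():
    # isolation_level=None: BEGIN/COMMIT/ROLLBACK are issued explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
        CREATE TABLE test (id INTEGER PRIMARY KEY, body TEXT);

        -- Hypothetical queue table: one row per change that needs indexing.
        CREATE TABLE index_queue (
            seq     INTEGER PRIMARY KEY AUTOINCREMENT,
            op      TEXT NOT NULL,      -- 'I', 'U' or 'D'
            row_key INTEGER NOT NULL    -- key of the changed row in test
        );

        CREATE TRIGGER test_ai AFTER INSERT ON test BEGIN
            INSERT INTO index_queue (op, row_key) VALUES ('I', NEW.id);
        END;
        CREATE TRIGGER test_au AFTER UPDATE ON test BEGIN
            INSERT INTO index_queue (op, row_key) VALUES ('U', NEW.id);
        END;
        CREATE TRIGGER test_ad AFTER DELETE ON test BEGIN
            INSERT INTO index_queue (op, row_key) VALUES ('D', OLD.id);
        END;
    """)
    return conn

conn = make_db()

# A rolled-back transaction leaves nothing behind in the queue,
# because the trigger's insert belongs to the same transaction.
conn.execute("BEGIN")
conn.execute("INSERT INTO test (id, body) VALUES (1, 'rolled back')")
conn.execute("ROLLBACK")

# A committed transaction leaves one queue row per change.
conn.execute("BEGIN")
conn.execute("INSERT INTO test (id, body) VALUES (2, 'help wanted')")
conn.execute("UPDATE test SET body = 'helpful' WHERE id = 2")
conn.execute("COMMIT")

queued = conn.execute(
    "SELECT op, row_key FROM index_queue ORDER BY seq"
).fetchall()
print(queued)
```

Because the queue rows are written by the triggers inside the enclosing transaction, the indexing thread only ever has to look at committed queue rows and never needs to undo anything itself.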

(Or I could have the triggers call CLR functions directly to do the maintenance, or something else?)

One question I had: is there a way to hook into "commit" so that I can notify a thread to wake up and do work?

Also, I was thinking about putting together some code that takes in a query and returns a list of primary keys from a table, along with the score of each hit.

It would be neat if the CLR could support a function that returns a table, so that I could do something like:

select *
from test t,
     fulltextsearch('test', 'help*') h
where t.key = h.key
order by h.score desc
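
Until something like that is supported, one possible workaround is to materialise the hits in a temporary table and join against it. Below is a rough sketch in Python with the built-in sqlite3 module as a stand-in database. The fulltext_search function here is a toy prefix matcher invented for the example, not the Lucene.net API, and the hits table and its columns are assumptions.

```python
import sqlite3

def fulltext_search(rows, query):
    """Toy stand-in for a real index query.

    Treats 'help*' as a prefix query; score = number of matching words.
    Returns a list of (key, score) pairs.
    """
    prefix = query.rstrip("*").lower()
    hits = []
    for key, body in rows:
        score = sum(1 for word in body.lower().split()
                    if word.startswith(prefix))
        if score:
            hits.append((key, float(score)))
    return hits

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO test VALUES (?, ?)", [
    (1, "help with indexing"),
    (2, "unrelated text"),
    (3, "helpful helper wanted"),
])

# Materialise the (key, score) pairs so plain SQL can join against them.
conn.execute("CREATE TEMP TABLE hits (id INTEGER, score REAL)")
rows = conn.execute("SELECT id, body FROM test").fetchall()
conn.executemany("INSERT INTO hits VALUES (?, ?)",
                 fulltext_search(rows, "help*"))

results = conn.execute(
    "SELECT t.id, t.body, h.score "
    "FROM test t JOIN hits h ON t.id = h.id "
    "ORDER BY h.score DESC"
).fetchall()
print(results)
```

The join has the same shape as the query above; the only difference is that the application fills the hits table before running it, instead of the engine calling the function.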

Coordinator
Apr 21, 2010 at 3:33 AM

Hi,

It's better to use a separate thread to do the processing and not do too much work inside the triggers.


Do you really need to hook into "commit"? All actions inside a trigger are bound to the enclosing transaction context, so you just need to create the processing thread's connection with at least read committed isolation. Can't you use triggers to do the wakeups as well? If you do need to hook into commit or rollback, you can do so through the TransactionManagerMVCC class.


Supporting functions that return tables/cursors is on our to-do list.


-thanks

Apr 21, 2010 at 4:07 AM

Yeah, the plan was to use a separate thread. The problem is that if that thread connects to the database using read committed, and I wake it up via a CLR function in the trigger, it won't see the data that was just inserted, right? (Scenario: autocommit=false, 100 inserts, then a commit.) I'd then have to make the thread poll and wait until it could see the data before it could index it.

However, if the thread were notified on commit, it could wake up, check a table (being used as a queue), and process all of the rows it could see via its read committed connection.

When you eventually support returning tables/cursors, what data type would you plan to map it to in .NET land? (Or have you not given that any thought yet?)

I'm probably going to start trying to put parts of this together, so if you know what interface I'd have to support, it would be helpful.

 

Coordinator
Apr 21, 2010 at 6:40 AM

Sounds good.

We haven't thought about it yet, but we're curious to know your thoughts on how we should support returning tables/cursors.