Parsing files after changes

Coordinator
Jan 27, 2012 at 6:02 PM
Edited Jan 27, 2012 at 6:17 PM

Hi - I was reviewing kostata's change (f39f26cf46c6) and saw that the parse method assigns a new guid on every call.

I started to wonder: how should the plugin find out whether the ProgramElement currently being analyzed is a new one (new guid) or an old one? And if it is an old one, how do we find out whether it has changed?

We are indexing files in the background, and I just want to make sure what the rules of this step are. If we decide to update only those elements that have changed since the last indexing call, we will have to fetch the related documents from the Indexer for comparison (performance issue?). If we decide to use a delete-insert rule when something has changed, which attributes allow us to find the documents for deletion?

 

Coordinator
Jan 27, 2012 at 6:35 PM
I think updating on a per-file basis is the simplest option, and I suppose we have a way of detecting that a file has changed. Not sure if I understood your point correctly.




Coordinator
Jan 27, 2012 at 6:43 PM
kostata wrote:
I think updating on a per-file basis is the simplest option, and I suppose we have a way of detecting that a file has changed. Not sure if I understood your point correctly.

Ok - you're right - we know when the file has changed, but my primary question was: when there is a Lucene document for the class Class1 and the file where this class is defined has changed, what do we do in Lucene/Indexer? Do we compare the old document with the new one created by the fresh parse and UPDATE the appropriate fields (NOT guids), or do we DELETE the old document with all related documents (methods, properties etc.) and insert a new one (NEW guid)?

Coordinator
Jan 27, 2012 at 6:50 PM
I guess it's a performance issue. I would tend towards the simplicity of the delete and replace approach, but I could be wrong. Not sure if this is a bad thing to do with Lucene.
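
To make the delete-and-replace option concrete, here is a minimal sketch. It is not Sando's actual Indexer: it assumes every indexed document carries the path of the file it came from, so that path is the attribute used to find everything to delete. `InMemoryIndex`, `Document` and `reindex_file` are hypothetical names, and a plain dictionary stands in for Lucene.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Document:
    guid: str        # regenerated on every parse, as in the reviewed change
    file_path: str   # assumed field: the attribute used to find stale documents
    name: str
    kind: str        # e.g. "class", "method", "property"


class InMemoryIndex:
    """Hypothetical stand-in for the Lucene index."""

    def __init__(self):
        self.docs = {}  # guid -> Document

    def delete_by_file(self, file_path):
        # Collect first, then delete, so the dict is not mutated while iterating.
        stale = [g for g, d in self.docs.items() if d.file_path == file_path]
        for guid in stale:
            del self.docs[guid]
        return len(stale)

    def add(self, doc):
        self.docs[doc.guid] = doc


def reindex_file(index, file_path, parsed_elements):
    """Drop every document that came from file_path, then insert fresh ones.

    parsed_elements is a list of (name, kind) pairs from a new parse.
    Returns the number of stale documents removed.
    """
    removed = index.delete_by_file(file_path)
    for name, kind in parsed_elements:
        index.add(Document(str(uuid.uuid4()), file_path, name, kind))
    return removed
```

With this shape nothing has to be fetched back for comparison and the new guids do no harm, because the class and all of its methods and properties are removed together through the shared file path.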



Coordinator
Jan 27, 2012 at 11:58 PM

My thought was the same as Kosta's: we would delete and replace any updated file.

FYI, here's what I'm planning on doing in the UI (which is in charge of monitoring the file updates):

1. When first opening a solution, the UI sends all files to the indexer. The indexer should determine whether each file needs to be indexed or not according to the file's timestamp.

2. When a solution is open and a file is saved, the UI sends this file to the indexer to be indexed.

Given this approach in the UI, should we store a timestamp with each element so that the indexer knows whether to update a given element or ignore it? We may be able to just store <timestamp, filename> pairs outside of the index and then ignore updates to files that haven't changed since they were last indexed.
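
A rough sketch of the <timestamp, filename> idea, kept outside the index. All names here (`TimestampStore`, `files_to_index`) are hypothetical, and persisting the store between sessions is left out.

```python
class TimestampStore:
    """Remembers the modification time each file had when it was last indexed."""

    def __init__(self):
        self._last_indexed = {}  # filename -> modification time at last indexing

    def needs_indexing(self, filename, mtime):
        last = self._last_indexed.get(filename)
        # Unknown files are always indexed; known files only if modified since.
        return last is None or mtime > last

    def mark_indexed(self, filename, mtime):
        self._last_indexed[filename] = mtime


def files_to_index(store, files):
    """files is an iterable of (filename, mtime) pairs sent by the UI."""
    return [name for name, mtime in files if store.needs_indexing(name, mtime)]
```

On opening a solution the indexer would run every file through `files_to_index`, call delete-and-replace only for the ones returned, and then `mark_indexed` each of them. The modification time could come from something like `os.path.getmtime`.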

Jan 28, 2012 at 9:17 AM

How does VS notify our file monitor when a source file is included in or excluded from a project? Is there a different event notification we should subscribe to for this?

Coordinator
Jan 28, 2012 at 6:46 PM
hanuman wrote:

How does VS notify our file monitor when a source file is included in or excluded from a project? Is there a different event notification we should subscribe to for this?

Good point, I created http://sando.codeplex.com/workitem/30 to track this.