Lucene Indexer input
file1.txt
brown fox jumps above lazy dog
file2.txt
red fox jumps above active cat
file3.txt
green fox lives in Pune
Lucene Internal Data Structures
Document Map (doc map)
1 = file1.txt
2 = file2.txt
3 = file3.txt
Inverted Index (Lucene Index Structure / Dictionary Structure / MultiMap)
brown = 1
fox = 1,2,3
jumps = 1,2
above = 1,2
lazy = 1
dog = 1
red = 2
active = 2
cat = 2
green = 3
lives = 3
in = 3
pune = 3
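The two structures above can be sketched in a few lines of Python. This is a minimal illustration, not Lucene's actual implementation: the function name build_index is hypothetical, and the "analysis" step here is assumed to be just lowercasing and splitting on whitespace, whereas a real Lucene analyzer does considerably more.

```python
from collections import defaultdict

# Sample input from the section above: file name -> file contents.
FILES = {
    "file1.txt": "brown fox jumps above lazy dog",
    "file2.txt": "red fox jumps above active cat",
    "file3.txt": "green fox lives in Pune",
}


def build_index(files):
    """Return (doc_map, inverted_index) for a dict of file name -> text.

    Hypothetical helper for illustration only.
    """
    doc_map = {}
    inverted_index = defaultdict(set)
    # Assign doc ids 1, 2, 3, ... in insertion order, as in the doc map above.
    for doc_id, (name, text) in enumerate(files.items(), start=1):
        doc_map[doc_id] = name
        # Simplified analysis (an assumption): lowercase, split on whitespace.
        for term in text.lower().split():
            inverted_index[term].add(doc_id)
    return doc_map, inverted_index


doc_map, inverted_index = build_index(FILES)
print(doc_map)                          # doc id -> file name
print(sorted(inverted_index["fox"]))    # doc ids that contain "fox"
print(sorted(inverted_index["pune"]))   # "Pune" was lowercased to "pune"
```

Each term maps to a set of doc ids, which is why the structure is sometimes described as a multimap: one key, many values.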
Search Example 1:
Query: Search for "fox"
Result: doc ids 1,2,3
Search Example 2:
Query: Search for "fox" AND "brown"
Result:
Result for "fox" = 1,2,3
Result for "brown" = 1
ANDing (intersection) of the result sets = (1,2,3) & (1) = doc id 1
So the query matches only file1.txt.
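Both search examples come down to looking up posting lists and intersecting them. The sketch below shows that idea with Python sets. The names INDEX, DOC_MAP and search_and are hypothetical, the posting lists are copied by hand from the inverted index above, and this is not how Lucene's query API is used.

```python
# Posting lists copied from the inverted index above (term -> doc ids).
INDEX = {
    "brown": {1},
    "fox": {1, 2, 3},
    "jumps": {1, 2},
    "above": {1, 2},
    "red": {2},
}
DOC_MAP = {1: "file1.txt", 2: "file2.txt", 3: "file3.txt"}


def search_and(index, *terms):
    """Return the sorted doc ids that contain every given term."""
    if not terms:
        return []
    # A term missing from the index has an empty posting list.
    postings = [index.get(term.lower(), set()) for term in terms]
    # Start from the shortest list so the intersection stays small.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return sorted(result)


print(search_and(INDEX, "fox"))                               # [1, 2, 3]
print(search_and(INDEX, "fox", "brown"))                      # [1]
print([DOC_MAP[d] for d in search_and(INDEX, "fox", "brown")])  # ['file1.txt']
```

A term that does not appear in the index yields an empty posting list, so any AND query containing it returns no documents.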
When we have to search across millions of files, handling the large result sets becomes challenging. For that we need a distributed setup, such as Elasticsearch.
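The idea behind a distributed setup can be sketched as follows: documents are split across several smaller indexes (shards), a query is sent to every shard, and the partial results are merged. This is only a toy illustration under stated assumptions. The shard count, the routing rule (doc id modulo shard count) and the function names are hypothetical, and the code does not reflect how Elasticsearch is implemented or configured.

```python
def build_shard(docs):
    """Build a small inverted index (term -> set of doc ids) for one shard."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


DOCS = {
    1: "brown fox jumps above lazy dog",
    2: "red fox jumps above active cat",
    3: "green fox lives in Pune",
}
NUM_SHARDS = 2  # hypothetical shard count

# Route each document to a shard by doc id (hypothetical routing rule).
shard_docs = [{} for _ in range(NUM_SHARDS)]
for doc_id, text in DOCS.items():
    shard_docs[doc_id % NUM_SHARDS][doc_id] = text
shards = [build_shard(docs) for docs in shard_docs]


def search(term):
    """Query every shard (scatter) and merge the partial results (gather)."""
    hits = set()
    for shard in shards:
        hits |= shard.get(term.lower(), set())
    return sorted(hits)


print(search("fox"))    # [1, 2, 3]
print(search("brown"))  # [1]
```

Here the shards live in one process. In a real cluster each shard would sit on its own node, so the per-shard lookups can run in parallel.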