Discussion:
[BUG] Can't search for some square brackets in From:
Justin Humm
2018-10-08 22:20:38 UTC
Hello,

I am having trouble searching the From field of mails that contain square
brackets. The first instance of the problem I found is

From: =?UTF-8?Q?Shedhalle_T=c3=bcbingen_[Festival]?= <***@shedhalle.de>

In this case, on 0.27 I'm unable to find any mail with

from:"Shedhalle"
from:"Tübingen"
from:"Festival"

However I can find mails with

From: "dependabot[bot]" <***@github.com>

by searching for

from:"dependabot"
from:"bot"

I managed to write a short test that reproduces the problem. It fails on
master, but should pass imho. It is attached below this mail. For me it
generates the following output:


$ ./T080-search.sh

T080-search: Testing "notmuch search" in several variations
PASS Search body
PASS Search by from:
PASS Search by to:
PASS Search by subject:
PASS Search by subject (utf-8):
PASS Search by id:
PASS Search by mid:
PASS Search by tag:
PASS Search by thread:
PASS Search body (phrase)
PASS Search by from: (address)
PASS Search by from: (name)
FAIL Search by from: (name with brackets)
--- T080-search.13.expected 2018-10-08 22:02:44.157369241 +0000
+++ T080-search.13.output 2018-10-08 22:02:44.158369260 +0000
@@ -1 +1 @@
-thread:XXX 2000-01-01 [1/1] Search By From Name [with brackets]; search by from (name with brackets) (inbox unread)
+
PASS Search by from: (name and address)
PASS Search by from: without prefix (name and address)
PASS Search by to: (address)
PASS Search by to: (name)
PASS Search by to: (name and address)
PASS Search by to: without prefix (name and address)
PASS Search by subject: (phrase)
FAIL Search for all messages ("*")
--- T080-search.21.EXPECTED 2018-10-08 22:02:44.462375008 +0000
+++ T080-search.21.OUTPUT 2018-10-08 22:02:44.463375027 +0000
@@ -35,7 +35,7 @@
thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; negative result (inbox unread)
thread:XXX 2000-01-01 [1/1] ***@example.com; search by from (address) (inbox unread)
thread:XXX 2000-01-01 [1/1] Search By From Name; search by from (name) (inbox unread)
-thread:XXX 2000-01-01 [1/1] Search By From Name [with brackets]; search by from (name with brackets) (inbox unread)
+thread:XXX 2000-01-01 [1/1] ; search by from (name with brackets) (inbox unread)
thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; search by to (address) (inbox unread)
thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; search by to (name) (inbox unread)
thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; subject search test (phrase) (inbox unread)
PASS Search body (utf-8):
PASS headers do not have adjacent term positions
PASS parts have non-overlapping term positions
PASS parts do not have adjacent term positions


I could imagine that the problem occurs not just with square brackets but
with other special characters as well, though I didn't test for that. Unfortunately,
I have no knowledge of C or notmuch development. If anybody can give me a hint
or a general direction on how to fix this, I'd be glad to try it myself.

I hope that I formatted this email correctly; I have never sent a git commit by mail before.


Best,
Justin



---
test/T080-search.sh | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/test/T080-search.sh b/test/T080-search.sh
index a3f0dead..f2d16f74 100755
--- a/test/T080-search.sh
+++ b/test/T080-search.sh
@@ -67,6 +67,11 @@ add_message '[subject]="search by from (name)"' '[date]="Sat, 01 Jan 2000 12:00:
output=$(notmuch search 'from:"Search By From Name"' | notmuch_search_sanitize)
test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Search By From Name; search by from (name) (inbox unread)"

+test_begin_subtest "Search by from: (name with brackets)"
+add_message '[subject]="search by from (name with brackets)"' '[date]="Sat, 01 Jan 2000 12:00:00 -0000"' '[from]="Search By From Name [with brackets] <***@example.com>"'
+output=$(notmuch search 'from:"Search By From Name [with brackets]"' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Search By From Name [with brackets]; search by from (name with brackets) (inbox unread)"
+
test_begin_subtest "Search by from: (name and address)"
output=$(notmuch search 'from:"Search By From Name <***@example.com>"' | notmuch_search_sanitize)
test_expect_equal "$output" "thread:XXX 2000-01-01 [1/1] Search By From Name; search by from (name) (inbox unread)"
@@ -139,6 +144,7 @@ thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; body search (phrase) (inbox un
thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; negative result (inbox unread)
thread:XXX 2000-01-01 [1/1] ***@example.com; search by from (address) (inbox unread)
thread:XXX 2000-01-01 [1/1] Search By From Name; search by from (name) (inbox unread)
+thread:XXX 2000-01-01 [1/1] Search By From Name [with brackets]; search by from (name with brackets) (inbox unread)
thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; search by to (address) (inbox unread)
thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; search by to (name) (inbox unread)
thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; subject search test (phrase) (inbox unread)
--
2.18.1
David Bremner
2018-10-08 23:54:51 UTC
Post by Justin Humm
Hello,
I struggle with searching the From field of mails, that have square brackets
in it. The first instance of that problem I found is
In this case, on 0.27 I'm unable to find any mail with
from:"Shedhalle"
from:"Tübingen"
from:"Festival"
However I can find mails with
by searching for
from:"dependabot"
from:"bot"
I managed to write a short test, that reproduces the problem. It fails on
master, but should pass imho. It is attached below this mail. For me it
I don't think the issue with your test is the same problem. With that
sample data, 'from:"Search By From Name"' works fine to match the
thread. I'm not really sure why your test is failing, but in general it
doesn't really make sense to search for punctuation, unless you use a
(slower) regexp search. The details are in notmuch-search-terms(7),
under "Terms and phrases"
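
As a rough illustration of why punctuation is not searchable as such, the
following Python sketch approximates word-based term generation. It is not
Xapian's actual tokenizer; the function name rough_terms and the regular
expression are assumptions made only for this example.

```python
import re


def rough_terms(text):
    """Return lowercase word-like tokens; punctuation such as [ ] is dropped.

    This is only an approximation of word-based indexing, not the real
    term generator used by notmuch/Xapian.
    """
    return [token.lower() for token in re.findall(r"\w+", text)]


# The display name from the proposed test, with and without brackets.
with_brackets = rough_terms("Search By From Name [with brackets]")
without_brackets = rough_terms("Search By From Name with brackets")

# Both spellings reduce to the same list of terms in this sketch.
print(with_brackets)
print(with_brackets == without_brackets)

# The name from the working example splits into two separate terms.
print(rough_terms("dependabot[bot]"))
```

In this approximation the brackets never become part of a term, which is
consistent with from:"dependabot" and from:"bot" both matching in the
report above.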

Can you provide a sample complete message where searching doesn't work?
I suspect it's most likely a problem with the encoded header. I tried
copying your "Shedhalle" from header into a test message and it seemed
like the terms for phrase search were not being generated at index
time. That looks like a bug, but I'd prefer to see a real message, if
possible.
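
To check what the encoded From header is meant to decode to, independently
of notmuch, a small Python sketch using the standard library's
email.header.decode_header may help. The helper name decode_header_value is
hypothetical, and the address stays redacted as in the original report.
Python's decoder may be more lenient than the parser notmuch uses, so this
only shows the intended text, not what notmuch sees at index time.

```python
from email.header import decode_header

# Header value copied from the report (the address is redacted there too).
raw_from = "=?UTF-8?Q?Shedhalle_T=c3=bcbingen_[Festival]?= <***@shedhalle.de>"


def decode_header_value(value):
    """Join the decoded chunks of an RFC 2047 encoded header value."""
    pieces = []
    for chunk, charset in decode_header(value):
        # decode_header returns bytes for encoded words and, depending on
        # the input, bytes or str for the unencoded remainder.
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")
        pieces.append(chunk)
    return "".join(pieces)


decoded = decode_header_value(raw_from)
print(decoded)
```

If the printed name is readable here but the corresponding terms are missing
from the index, that would point at header handling during indexing rather
than at the search query.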
Justin Humm
2018-10-10 10:44:28 UTC
Post by David Bremner
I don't think the issue with your test is the same problem.
What makes me think that my test triggers a bug is that notmuch doesn't
show any information about the sender in the overview of all messages.
This is from the test log:

-thread:XXX 2000-01-01 [1/1] Search By From Name [with brackets]; search by from (name with brackets) (inbox unread)
+thread:XXX 2000-01-01 [1/1] ; search by from (name with brackets) (inbox unread)

Also, leaving the brackets out of the search query in the test doesn't
help either.

Anyway, I attached a real message.

