question about deletion and counts
Jeff Templon
2018-11-03 11:39:52 UTC
simeto:~> notmuch search --output=files tag:deleted | wc -l
666
simeto:~> notmuch search --format=text0 --output=files tag:deleted | xargs -0 rm

afterwards, from notmuch new:

No new mail. Removed 577 messages. Detected 89 file renames.

577 + 89 = 666 ... my guess is that there were 577 messages and 89 files
that represented duplicates of messages. But I didn't rename the files,
I deleted them. Should I worry? Why is the message inaccurate?

JT
Carl Worth
2018-11-07 22:05:24 UTC
Post by Jeff Templon
No new mail. Removed 577 messages. Detected 89 file renames.
577 + 89 = 666 ... my guess is that there were 577 messages and 89 files
that represented duplicates of messages.
Yes. Among the messages you had tagged as deleted you had 577 unique
message IDs. In addition, you had another 89 files with message IDs that
were duplicates of one of the 577.
Post by Jeff Templon
But I didn't rename the files, I deleted them. Should I worry?
Nope. Nothing to worry about here.
Post by Jeff Templon
Why is the message inaccurate?
Because notmuch has a narrower view of what a "rename" is than you
do.

A file rename is a high-level operation that notmuch sees as two
separate operations over the course of a single run of notmuch
new:

1. A new file is added with a message ID that already exists in the
database

2. A file is removed with a message ID for which there are multiple
files in the database

But notmuch doesn't check whether both of these operations occur in
a single pass in order to detect a rename. Instead, it counts
every occurrence of (2) above as a rename. Here's what the code
looks like (notmuch-new.c:remove_filename):

    status = notmuch_database_remove_message (notmuch, path);
    if (status == NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID) {
        add_files_state->renamed_messages++;
        if (add_files_state->synchronize_flags == true)
            notmuch_message_maildir_flags_to_tags (message);
        status = NOTMUCH_STATUS_SUCCESS;
    } else if (status == NOTMUCH_STATUS_SUCCESS) {
        add_files_state->removed_messages++;
    }

So, whenever a filename is removed, it gets counted either as a rename
(if there is still at least one other filename in the database with the
same message ID) or as a removal (if this was the last filename for
that message ID).
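
To make the counting rule concrete, here is a minimal standalone sketch. It is not notmuch code: struct file_entry, struct counts, files_for_id and remove_file are hypothetical names invented for this illustration, and a plain array stands in for the database. It only mimics the decision described above, where a removal counts as a "rename" when another file with the same message ID is still indexed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one file known to the database. */
struct file_entry {
    const char *path;       /* file name on disk */
    const char *message_id; /* message ID found in that file */
    int present;            /* 1 while the file is still indexed */
};

/* Hypothetical counters, mirroring the two numbers notmuch new reports. */
struct counts {
    int removed_messages; /* the last file for a message ID went away */
    int renamed_messages; /* another file with the same ID remains */
};

/* Count the still-indexed files that carry the given message ID. */
int files_for_id(const struct file_entry *files, size_t n, const char *id)
{
    int count = 0;
    for (size_t i = 0; i < n; i++)
        if (files[i].present && strcmp(files[i].message_id, id) == 0)
            count++;
    return count;
}

/* Drop files[idx] from the index and classify the removal. */
void remove_file(struct file_entry *files, size_t n, size_t idx,
                 struct counts *c)
{
    if (!files[idx].present)
        return; /* already removed; nothing to count */

    files[idx].present = 0;

    if (files_for_id(files, n, files[idx].message_id) > 0)
        c->renamed_messages++; /* a duplicate is still there */
    else
        c->removed_messages++; /* that was the last copy */
}
```

With three files where two share a message ID, removing all three in order gives one "rename" and two removals. Every deleted file lands in exactly one counter, so the two counters always add up to the number of files deleted, which is the 577 + 89 = 666 seen above.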

I suppose you could come up with some other name for what it is
counting, such as "removals of duplicate messages" instead of "rename",
but that's what's happening.

I hope that helps.

-Carl
Jeff Templon
2018-11-08 10:01:19 UTC
Hi Carl,

Thanks for your answer.
Post by Carl Worth
A file rename is a high-level operation that will be seen by notmuch as
multiple operations seen over the course of a single run of notmuch
1. A new file is added with a message ID that already exists in the
database
2. A file is removed with a message ID for which there are multiple
files in the database
But notmuch doesn't detect whether both of these operations are seen in
a single pass in order to detect a rename. Instead, what it is doing is
counting every occurrence of (2) above as a rename. Here's what the code
    status = notmuch_database_remove_message (notmuch, path);
    if (status == NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID) {
        add_files_state->renamed_messages++;
        if (add_files_state->synchronize_flags == true)
            notmuch_message_maildir_flags_to_tags (message);
        status = NOTMUCH_STATUS_SUCCESS;
    } else if (status == NOTMUCH_STATUS_SUCCESS) {
        add_files_state->removed_messages++;
    }
Perfect explanation, thanks.
Post by Carl Worth
I suppose you could come up with some other name for what it is
counting, such as "removals of duplicate messages" instead of "rename",
but that's what's happening.
Yes, that'd be my suggestion :-) A misleading name is one of my
personal buttons that sometimes gets pushed. If you seriously consider
changing it, I'd suggest "file reassignments" instead of "file renames".
A file rename to me is

mv jeff.txt carl.txt

The file was named jeff.txt but was renamed to carl.txt. In the case you
describe, a file with a certain name is either assigned to a message ID
or de-assigned from that message ID; the actual file name is not changed,
as I understand it.

Anyway thanks for the explanation! Good that I don't need to worry.

BTW I've got integration between org and notmuch up and running now, I'm
really liking this capability.

JT
