First, with the existing git-subtree tool, you can avoid the duplicates by passing the --squash option. This "avoids" the problem by simply suppressing all of the subtree's individual commits. From the man page:
--squash
This option is only valid for add, merge, and pull commands.
Instead of merging the entire history from the subtree project, produce only a single commit that contains all the differences you want to merge, and then merge that new commit into your project.
It should always be used, or you will pull in the duplicates. The trade-off is that you will not see any remote commit history this way.
If you want to retain the remote commit history, duplicates are not unavoidable in any fundamental sense. They are only unavoidable with the previously described and implemented (and perhaps all previously known) subtree methods.
git-alltrees avoids these duplicates by using a more complex translation strategy.
To understand the duplicates, you have to understand what identifies a commit. A commit is identified by its hash, a checksum that depends on essentially all of the data related to the commit. That includes the obvious commit content and also the ids of the parent commits stored in it. So if one commit changes, the hashes of all its descendants change too.
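As a rough illustration of that dependency, here is a minimal Python sketch that computes commit ids the way Git does for SHA-1 repositories (a SHA-1 over a `commit <size>\0` header plus the commit text). The author line, the messages, and the helper names `git_object_id` and `commit_id` are made up for the example; only the object format itself is Git's.

```python
import hashlib

def git_object_id(kind: str, body: bytes) -> str:
    # Git names an object by SHA-1 over "<kind> <size>\0" followed by the body.
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

def commit_id(tree: str, parents: list[str], message: str) -> str:
    # Hypothetical fixed author/committer so only tree, parents and message vary.
    who = "A U Thor <author@example.com> 1700000000 +0000"
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    body = "\n".join(lines) + "\n"
    return git_object_id("commit", body.encode())

# Git's well-known id for the empty tree; reused so the content never differs.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

root_a = commit_id(EMPTY_TREE, [], "root")
root_b = commit_id(EMPTY_TREE, [], "root (rewritten)")

# Same tree, same message, same author: only the parent id differs.
child_of_a = commit_id(EMPTY_TREE, [root_a], "child")
child_of_b = commit_id(EMPTY_TREE, [root_b], "child")

print(child_of_a != child_of_b)  # prints True
```

The two "child" commits are identical in every respect except their parent, yet they get different ids. That is why a rewritten ancestor turns every descendant into a "new" commit.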
With subtrees, pushing to a remote necessarily changes the content: files outside the subtree are removed and the directory layout changes, so the hashes change. When you pull the commits back, the directories can be changed back, but the files are still missing, so the pulled commits still do not hash to the originals.
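The effect of the directory change on the hash can be sketched the same way. The snippet below builds two Git tree objects by hand: a hypothetical superproject root that holds a `README` plus a `vendor` directory, and the subtree's own root, which holds only the library file. File names and contents are invented for the example, and the entry sorting is simplified (real Git sorts directory names as if they ended in `/`).

```python
import hashlib

def object_id(kind: str, body: bytes) -> bytes:
    # Raw 20-byte SHA-1 over "<kind> <size>\0" + body, as Git computes it.
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).digest()

def tree_id(entries: list[tuple[str, str, bytes]]) -> bytes:
    # Each entry is (mode, name, raw object id); a tree body is the
    # concatenation of "<mode> <name>\0<raw id>" in name order.
    body = b"".join(
        f"{mode} {name}".encode() + b"\0" + oid
        for mode, name, oid in sorted(entries, key=lambda e: e[1])
    )
    return object_id("tree", body)

lib_blob = object_id("blob", b"int answer(void) { return 42; }\n")
readme_blob = object_id("blob", b"superproject readme\n")

# What the subtree remote sees: the library file at the top level.
subtree_root = tree_id([("100644", "lib.c", lib_blob)])

# What the superproject sees: the same files nested under vendor/,
# next to files the remote never receives.
superproject_root = tree_id([
    ("100644", "README", readme_blob),
    ("40000", "vendor", subtree_root),
])

print(subtree_root.hex() != superproject_root.hex())  # prints True
```

Because a commit stores the id of its root tree, the commit pushed to the remote cannot have the same hash as the superproject commit it came from, even though the library file itself is byte-for-byte identical.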
git-alltrees re-associates the partial commits in the pulled branch with their original commits and replaces them, thus restoring the original hashes. Any new commits made on the remote will branch and merge naturally. The work is done using git-filter-repo.
I'm not trying to hide that this is my work, but it's work that was intended to answer exactly this question.