An 18-DOF hand integrating force–position multimodal perception using a monocular camera