upb/json: fix sign-extended index in jsondec_base64_tablelookup (#27215)

## Problem

`jsondec_base64_tablelookup()` in `upb/json/decode.c` indexes a 256-byte `signed char` table with `table[(unsigned)ch]`. Converting a negative `signed char` value straight to `unsigned` preserves the value modulo 2^32 instead of zero-extending the byte, so any input byte with the high bit set (`0x80..0xFF`) becomes a huge index (`0xFFFFFF80..0xFFFFFFFF`, assuming a 32-bit `unsigned`). On a 64-bit target the 256-byte table is then read approximately 4 GiB past its base, producing an out-of-bounds read.
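The conversion behavior described above can be reproduced in isolation. The following is a minimal sketch, not the upb code: `index_before_fix`, `index_after_fix`, and `check_indices` are hypothetical names introduced only for illustration, and the parameter is declared `signed char` to match the description above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: computes the index the way the unfixed cast does.
 * A negative value converted to unsigned wraps modulo UINT_MAX + 1. */
static size_t index_before_fix(signed char ch) {
  return (size_t)(unsigned)ch;
}

/* Hypothetical helper: computes the index the way the fixed cast does.
 * Converting to unsigned char keeps the result within [0, UCHAR_MAX]. */
static size_t index_after_fix(signed char ch) {
  return (size_t)(unsigned char)ch;
}

/* Illustrative self-check comparing both casts. */
static void check_indices(void) {
  signed char high = (signed char)-128; /* bit pattern 0x80 */

  /* ASCII bytes are non-negative, so both casts agree. */
  assert(index_before_fix('A') == index_after_fix('A'));

  /* Unfixed cast: -128 becomes UINT_MAX - 127 (0xFFFFFF80 for 32-bit unsigned). */
  assert(index_before_fix(high) == (size_t)UINT_MAX - 127u);

  /* Fixed cast: -128 becomes 0x80, a valid index into a 256-entry table. */
  assert(index_after_fix(high) == 0x80);
}
```

Calling `check_indices()` from a test harness exercises both casts without touching any table, so the sketch itself never performs an out-of-bounds read.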

The same pattern exists in the amalgamated copies that the Ruby and PHP extensions ship:
- `ruby/ext/google/protobuf_c/ruby-upb.c`
- `php/ext/google/protobuf/php-upb.c`

## Fix

Cast through `unsigned char` so the byte is zero-extended to `[0x80, 0xFF]` before being used as a table index. The change is one added keyword on one line, repeated in three files.

```diff
-  return table[(unsigned)ch];
+  return table[(unsigned char)ch];
```

## Compatibility

- `ch` in `[0x00, 0x7F]`: `(unsigned)ch` and `(unsigned char)ch` produce identical values, so there is no behavior change.
- `ch` in `[0x80, 0xFF]`: previously an out-of-bounds read. With the fix, the lookup returns the table's `-1` sentinel, which `jsondec_base64()` already handles as "invalid base64 char" via `if (val < 0)`.

No public API changes, no new allocations, no new branches.
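To illustrate why high-bit bytes fall through to the existing error path, here is a small sketch under stated assumptions. The table, its contents, and the names `demo_table`, `demo_table_init`, `demo_lookup`, `demo_all_valid`, and `demo_self_check` are hypothetical stand-ins; the real table in `upb/json/decode.c` may accept a different alphabet.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical 256-entry table: -1 marks "not a base64 character". */
static signed char demo_table[256];

/* Fills the table with -1, then assigns 0..63 to the standard alphabet. */
static void demo_table_init(void) {
  static const char alphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  memset(demo_table, -1, sizeof(demo_table));
  for (int i = 0; i < 64; i++) {
    demo_table[(unsigned char)alphabet[i]] = (signed char)i;
  }
}

/* Lookup using the fixed cast: every possible byte maps into [0, 255]. */
static int demo_lookup(char ch) {
  return demo_table[(unsigned char)ch];
}

/* Returns 1 if all n bytes are in the alphabet, 0 on the first invalid byte.
 * The `< 0` check mirrors the sentinel test described above. */
static int demo_all_valid(const char* s, size_t n) {
  for (size_t i = 0; i < n; i++) {
    if (demo_lookup(s[i]) < 0) return 0;
  }
  return 1;
}

/* Illustrative usage: ASCII input is unchanged, high-bit input is rejected. */
static void demo_self_check(void) {
  const char bad[] = {(char)0x80, (char)0xFF, 'A', 'B'};
  demo_table_init();
  assert(demo_all_valid("QUJD", 4) == 1);
  assert(demo_all_valid(bad, sizeof(bad)) == 0);
}
```

Because the index is always within the table after the cast, rejecting bytes in `[0x80, 0xFF]` needs no extra branch: the sentinel entries do the work.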

## Test plan

- Adds `optional bytes data = 11;` to `upb_test.Box` in `upb/json/test.proto` so a `bytes`-typed field is reachable from the existing `JsonDecode` helper in `decode_test.cc`.
- Adds `TEST(JsonTest, RejectsBase64WithHighBitBytes)` to `upb/json/decode_test.cc`, which decodes `{"data":"����"}` (the string value holds raw bytes with the high bit set, which render here as replacement characters) and verifies the decoder fails gracefully (no crash, returns nullptr). On the unfixed code under ASan this test exhibits the OOB read.
- Existing `upb/json/decode_test.cc` cases continue to pass.
- **Verified locally on `f331eba78` with `bazel test //upb/json:decode_test`**: with the fix all 5 tests pass; with the fix reverted (test kept), the new test fails with **SIGBUS** in `jsondec_base64_tablelookup` while the other 4 still pass — confirming the test exercises the exact code path the fix repairs.

## Files changed

| File | Change |
|---|---|
| `upb/json/decode.c` | Cast fix (`unsigned` to `unsigned char`) |
| `ruby/ext/google/protobuf_c/ruby-upb.c` | Same fix in amalgamated copy |
| `php/ext/google/protobuf/php-upb.c` | Same fix in amalgamated copy |
| `upb/json/test.proto` | `+optional bytes data = 11;` (test-only) |
| `upb/json/decode_test.cc` | Regression test |

If the project regenerates `ruby-upb.c` / `php-upb.c` from `upb/json/decode.c` automatically, please let me know and I will drop those two files from the PR.

## Reference

Reported via Google Bug Hunters / OSS VRP.

Closes #27215

COPYBARA_INTEGRATE_REVIEW=https://github.com/protocolbuffers/protobuf/pull/27215 from sukhoon0975:fix/upb-json-base64-sign-extend 18d34d7b070d2a4c3e2b80b18aba43d8132eddbc
PiperOrigin-RevId: 915053186
3 files changed
tree: c81dd078fd2368a08b3dfbb9af756464d7925e89
README.md

Protocol Buffers - Google's data interchange format

Copyright 2008 Google LLC

Overview

Protocol Buffers (a.k.a., protobuf) are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data. You can learn more about it in protobuf's documentation.

This README file contains protobuf installation instructions. To install protobuf, you need to install the protocol compiler (used to compile .proto files) and the protobuf runtime for your chosen programming language.

Working With Protobuf Source Code

Most users will find working from supported releases to be the easiest path.

If you choose to work from the head revision of the main branch your build will occasionally be broken by source-incompatible changes and insufficiently-tested (and therefore broken) behavior.

If you are using C++ or otherwise need to build protobuf from source as a part of your project, you should pin to a release commit on a release branch.

This is because even release branches can experience some instability in between release commits.

Bazel with Bzlmod

Protobuf supports Bzlmod with Bazel 8+. Users should specify a dependency on protobuf in their MODULE.bazel file as follows.

bazel_dep(name = "protobuf", version = <VERSION>)

Users can optionally override the repo name, such as for compatibility with WORKSPACE.

bazel_dep(name = "protobuf", version = <VERSION>, repo_name = "com_google_protobuf")

Bazel with WORKSPACE

Users can also add the following to their legacy WORKSPACE file.

Note that with the release of 30.x there are a few more load statements to properly set up rules_java and rules_python.

http_archive(
    name = "com_google_protobuf",
    strip_prefix = "protobuf-VERSION",
    sha256 = ...,
    url = ...,
)

load("@com_google_protobuf//:protobuf_deps.bzl", "protobuf_deps")

protobuf_deps()

load("@rules_java//java:rules_java_deps.bzl", "rules_java_dependencies")

rules_java_dependencies()

load("@rules_java//java:repositories.bzl", "rules_java_toolchains")

rules_java_toolchains()

load("@rules_python//python:repositories.bzl", "py_repositories")

py_repositories()

Protobuf Compiler Installation

The protobuf compiler is written in C++. If you are using C++, please follow the C++ Installation Instructions to install protoc along with the C++ runtime.

For non-C++ users, the simplest way to install the protocol compiler is to download a pre-built binary from our GitHub release page.

In the downloads section of each release, you can find pre-built binaries in zip packages: protoc-$VERSION-$PLATFORM.zip. It contains the protoc binary as well as a set of standard .proto files distributed along with protobuf.

If you are looking for an old version that is not available in the release page, check out the Maven repository.

These pre-built binaries are only provided for released versions. If you want to use the GitHub main version at HEAD, need to modify protobuf code, or are using C++, it's recommended to build your own protoc binary from source.

If you would like to build protoc binary from source, see the C++ Installation Instructions.

Protobuf Runtime Installation

Protobuf supports several different programming languages. For each programming language, you can find instructions in the corresponding source directory about how to install protobuf runtime for that specific language:

| Language | Source |
|---|---|
| C++ (include C++ runtime and protoc) | src |
| Java | java |
| Python | python |
| Objective-C | objectivec |
| C# | csharp |
| Ruby | ruby |
| Go | protocolbuffers/protobuf-go |
| PHP | php |
| Dart | dart-lang/protobuf |
| JavaScript | protocolbuffers/protobuf-javascript |

Quick Start

The best way to learn how to use protobuf is to follow the tutorials in our developer guide.

If you want to learn from code examples, take a look at the examples in the examples directory.

Documentation

The complete documentation is available at the Protocol Buffers doc site.

Support Policy

Read about our version support policy to stay current on support timeframes for the language libraries.

Developer Community

To be alerted to upcoming changes in Protocol Buffers and connect with protobuf developers and users, join the Google Group.